Web-based spell checker

ABSTRACT

A fast client-side spell checker is provided that builds efficient data structures out of a dictionary and a list of common misspellings and uses the structures to prune the number of searches required to identify misspelled words and provide suggestions for correcting the misspelled words. The spell checker is a browser-based application, which is provided by a server to a client device. The server also sends the dictionary and a list of common misspellings to the client device in the form of efficient data structures. The spell checker utilizes a set of rules to identify the words that are not in the dictionary but are intended to be correct as typed. The spell checker is used by different browser-based applications that utilize the same spell checker regardless of the browser platform used to access the applications. In this way, the spell checker provides a uniform spell checking user experience across different browser platforms.

BACKGROUND

Remote storage and computing services allow users to store data on remote computer servers and access them from multiple devices through a network, usually the Internet. In addition, some service providers allow users to access different applications in order to generate and manipulate documents. For instance, the users can download applications such as text editors, spreadsheet generators, presentation programs, etc., from the servers and access them in the browser.

Many of these applications are used to enter and manipulate text. A desired feature for any application that manipulates text is the ability to flag words that are not spelled correctly and to provide appropriate correction suggestions. Most spell checkers include a verification component and a suggestion component. The verification component utilizes one or more dictionaries or lists of words that are valid in each language. Each word in a document is compared against the entries for the appropriate language to identify possible misspellings. The suggestion component utilizes different algorithms and heuristics to identify what the user had intended to type and to provide correction suggestions.

The use of remote storage and computing services, as well as the use of smaller mobile devices with fewer computing resources, presents several challenges for spell checker applications. On one hand, the application has to be small for fast download through the Internet and fast execution on mobile devices with fewer computing resources. On the other hand, the application has to be able to identify misspelled words as a user types and to quickly provide correction suggestions when the user asks for them. Also, applications delivered in languages that are universally accepted by different browsers (e.g., delivered in JavaScript) are generally less efficient than native applications and require fast techniques in order to provide acceptable response time.

In a remote computing system, one possible solution is to send each word to the server and allow the more powerful resources of the server to do the spell checking. The drawback of this option is the additional network traffic between the remote device and the server. In addition, sending the words to the server for spell checking makes spell checking unavailable when the application is being used in offline mode.

Another possible solution for browser-based applications is to use the spell checking features of the browser. The drawback of this option is that different browsers have different spell checkers, so accessing the same application through different browsers provides a non-uniform experience for the user.

BRIEF SUMMARY

Some embodiments provide a fast client-side spell checker. The spell checker is a browser application that downloads a dictionary and a list of common misspellings in the form of efficient data structures. The spell checker makes use of the efficient data structures to prune the number of searches performed to identify whether a typed word is in the dictionary, to identify whether a typed word is a common misspelling, and/or to provide suggestions for correcting the misspelled words. The spell checker in some embodiments also uses a set of rules to identify strings that are not in the dictionary but are valid since the strings are what the user has intended to type.

The spell checker is used by different browser-based applications that manipulate text and require spell checking. These browser-based applications use the same spell checker regardless of the browser platform (e.g., Safari®, Chrome®, Internet Explorer®, Firefox®, etc.) that is used to access the applications. In this way, the spell checker provides a uniform spell checking user experience across different platforms. In contrast, browser-based applications that rely on the browser's spell checker provide a different experience for the user each time a different browser is used to access the application.

In some embodiments, the spell checker is a browser-based application, which is provided by a remote computing service server to a client device. The server also sends the dictionary and a list of common misspellings to the client device. In some embodiments, the data structure used for pruning is a prefix tree (or a trie). The server builds a prefix tree out of the dictionary word list. The server then encodes the tree in a compressed format and sends the tree to the spell checker in the client. The server also makes a look-up map in the form of a hash table for common misspellings and sends the table to the spell checker in some embodiments. In other embodiments, the server makes a prefix tree for the common misspellings and sends it to the spell checker.

The spell checker decompresses the prefix tree and uses the dictionary prefix tree to check if a string is a dictionary word. The spell checker also uses the dictionary prefix tree for suggestion generation. The spell checker uses the common misspelling hash table (or the prefix tree in the embodiments that utilize a prefix tree for the common misspellings) to determine whether a string is a common misspelling. In some embodiments, the spell checker is written in a browser compatible language such as JavaScript. In some embodiments, the prefix tree and the hash table are kept in browser cache and are used at a later time as long as the cached data structures are considered valid according to cache update rules.

A prefix tree is an ordered tree data structure where the keys are strings. The position of a node in the tree defines the key with which the node is associated. In a prefix tree, all descendants of a node have a common prefix of the string associated with that node and the root is associated with the empty string. Not every node in the dictionary prefix tree is associated with a valid word in the dictionary. For instance, the word “captain” is represented by seven nodes in the tree. The first node represents character “c” and is a child of the root node. The second node represents character “a” and is a child of the first node, etc. The seventh node, which is associated with the valid word “captain,” is flagged as a node that terminates a valid word in the dictionary. In addition, since “cap” is also a valid word, the third node is also flagged as a node that terminates a valid word. The common misspelling prefix tree has strings that represent common misspellings. Each common misspelling is associated with one or more suggestions for a correct spelling.

The spell checker compares a word with strings in the dictionary prefix tree. If the word does not match any valid word in the prefix tree, the word is identified as a possible misspelled word. Some strings such as compound words, IP addresses, decimal numbers, hexadecimal numbers, etc., are not in the dictionary but are valid strings since they are what the user has intended to represent. Some embodiments utilize a set of rules to identify these strings and exclude them from being flagged as misspelled words.
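By way of a non-limiting illustration, such a rule set might be sketched in JavaScript as follows. The patterns themselves and the names `VALID_STRING_RULES`, `isCompoundWord`, and `isExemptFromSpellCheck` are assumptions made for this example; the description names the kinds of strings that are exempted but does not specify the rules.

```javascript
// Hypothetical rule set: each rule is a predicate over the typed string.
// The patterns are illustrative assumptions, not rules taken from the description.
const VALID_STRING_RULES = [
  // Dotted-quad IP address, each part in the range 0-255.
  (s) => {
    const parts = s.split(".");
    return parts.length === 4 &&
      parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  },
  // Decimal number such as 3.14 or -42.
  (s) => /^[+-]?\d+(\.\d+)?$/.test(s),
  // Hexadecimal number such as 0x1F.
  (s) => /^0[xX][0-9a-fA-F]+$/.test(s),
];

// Compound-word rule (assumed form): every hyphen-separated part must
// itself be a dictionary word.
function isCompoundWord(s, isDictionaryWord) {
  const parts = s.split("-");
  return parts.length > 1 &&
    parts.every((p) => p.length > 0 && isDictionaryWord(p));
}

// True when the string should not be flagged even though it is not in
// the dictionary. `isDictionaryWord` is supplied by the caller.
function isExemptFromSpellCheck(s, isDictionaryWord) {
  return VALID_STRING_RULES.some((rule) => rule(s)) ||
    isCompoundWord(s, isDictionaryWord);
}
```

In this sketch the dictionary look-up is passed in as a callback, so the same rules could be used with a prefix tree or any other dictionary representation.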

A hash table is an associative array, which is a data structure that maps keys to values. The look-up map implemented as a hash table provides an efficient search mechanism when the number of entries in the table is small. Some embodiments prepare a list of suggestions for misspelled words. Each common misspelling is a key in the hash table and is associated with a set of values that are one or more strings used as suggestions to correct the common misspelling.

Some embodiments visually identify (e.g., underline) misspelled words. In some embodiments, a misspelled word is first compared with strings in the common misspelling hash table (or in the common misspelling prefix tree) and if a match is found, the associated suggestions are added to a list of possible suggestions. Some embodiments edit the misspelled words by adding, replacing, or deleting characters at each character position in a misspelled word in order to find correction suggestions. After each edit to a misspelled word, the partial edit result is compared with strings in the dictionary prefix tree and the last character edit is discarded if a search of the prefix tree indicates that no valid word in the dictionary includes such a prefix string.

Each edit to the misspelled word that results in a valid string in the dictionary is added to a list of possible suggestions. The list entries are then scored and a pre-determined number of suggestions with the highest scores are displayed when the user requests correction suggestions for the misspelled word.

A user is provided with tools to request suggestions. Some embodiments provide a novel user interface to present the list of suggestions to the user. In these embodiments, the user is not required to get a context menu by using the secondary selection tool of a selection device (what is commonly referred to as a right click and involves using the secondary selection option of a device such as a mouse or a touchpad to display a pop-up menu). Instead, the user interface recognizes a misspelled word and, when the user uses the primary selection tool of a selection device (e.g., the primary button of a mouse) to select a misspelled word, the list of suggestions is displayed. The user is then provided with the option to select one of the suggestions or leave the word as is.

The preceding Summary is intended to serve as a brief introduction to some embodiments of the invention. It is not meant to be an introduction or overview of all inventive subject matter disclosed in this document. The Detailed Description that follows and the Drawings that are referred to in the Detailed Description will further describe the embodiments described in the Summary as well as other embodiments. Accordingly, to understand all the embodiments described by this document, a full review of the Summary, Detailed Description and the Drawings is needed. Moreover, the claimed subject matters are not to be limited by the illustrative details in the Summary, Detailed Description and the Drawings, but rather are to be defined by the appended claims, because the claimed subject matters can be embodied in other specific forms without departing from the spirit of the subject matters.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the invention are set forth in the appended claims. However, for purpose of explanation, several embodiments of the invention are set forth in the following figures.

FIG. 1 conceptually illustrates a client-server system in some embodiments of the invention.

FIG. 2 conceptually illustrates a client device that does spell checking for browser-based applications according to prior art.

FIG. 3 shows the suggestions to correct the misspelled word “carr” when a first web-browser is used to access a browser-based application.

FIG. 4 shows the suggestions to correct the misspelled word “carr” when a second web-browser is used to access a browser-based application.

FIG. 5 conceptually illustrates the client-server system of FIG. 1 after a browser-based application that manipulates text is downloaded from the server.

FIG. 6 conceptually illustrates a portion of a dictionary prefix tree in some embodiments of the invention.

FIG. 7 conceptually illustrates the client device of FIG. 5 after the client receives the encoded data structure for the dictionary and the list of common misspellings from the server in some embodiments of the invention.

FIG. 8 conceptually illustrates a portion of a prefix tree in some embodiments of the invention.

FIG. 9 conceptually illustrates an alternative embodiment of the client-server system of FIG. 1 after a browser-based application that manipulates text is downloaded from the server.

FIG. 10 conceptually illustrates the client device of FIG. 9 after the client receives the encoded data structure for the dictionary and the list of common misspellings from the server.

FIG. 11 conceptually illustrates an alternative embodiment of the client-server system of FIG. 1 after a browser-based application that manipulates text is downloaded from the server.

FIG. 12 conceptually illustrates the client device of FIG. 11 after the client receives a browser compatible data file that includes the dictionary and the list of common misspellings from the server.

FIG. 13 conceptually illustrates a process for sending an application and associated files from a server to a client in some embodiments of the invention.

FIG. 14 conceptually illustrates a process for utilizing a web-based application on a client device in some embodiments of the invention.

FIG. 15 illustrates an example of how a user requests an application from a server in some embodiments of the invention.

FIG. 16 conceptually illustrates a process for utilizing a web-based application on a client device in some alternative embodiments of the invention.

FIG. 17 conceptually illustrates a portion of a common misspellings prefix tree in some embodiments of the invention.

FIG. 18 conceptually illustrates how a hash table is used as a look-up map for common misspellings in some embodiments of the invention.

FIG. 19 conceptually illustrates a process for determining whether a set of typed words includes misspelled words in some embodiments of the invention.

FIG. 20 illustrates a user interface in some embodiments of the invention.

FIGS. 21A and 21B conceptually illustrate a process for identifying correction suggestions for misspelled words in some embodiments of the invention.

FIG. 22 conceptually illustrates a dictionary prefix tree in some embodiments.

FIG. 23 conceptually illustrates a process for displaying and applying spelling suggestions in some embodiments of the invention.

FIG. 24 illustrates a portion of a graphical user interface in some embodiments of the inventions.

FIG. 25 conceptually illustrates a portion of a graphical user interface in some embodiments of the inventions.

FIG. 26 conceptually illustrates an alternative process for displaying and applying spelling suggestions in some embodiments of the invention.

FIG. 27 conceptually illustrates a portion of a graphical user interface in some embodiments of the inventions.

FIG. 28 conceptually illustrates a portion of a graphical user interface in some embodiments of the inventions.

FIG. 29 conceptually illustrates the software architecture for building the prefix trees in some embodiments of the invention.

FIG. 30 conceptually illustrates the software architecture for determining whether a word is misspelled.

FIG. 31 conceptually illustrates the software architecture for finding suggestions for misspelled words.

FIG. 32 conceptually illustrates the software architecture for displaying suggestions and receiving user selection for misspelled words.

FIG. 33 is an example of an architecture of a mobile computing device in some embodiments of the invention.

FIG. 34 conceptually illustrates an electronic system with which some embodiments of the invention are implemented.

DETAILED DESCRIPTION

In the following detailed description of the invention, numerous details, examples, and embodiments of the invention are set forth and described. However, it will be clear and apparent to one skilled in the art that the invention is not limited to the embodiments set forth and that the invention may be practiced without some of the specific details and examples discussed.

FIG. 1 conceptually illustrates a client-server system 100 in some embodiments of the invention. The system includes one or more server devices 105 and one or more client devices 110; one of each is shown for convenience. The client 110 and the server 105 are connected through a network 125 such as the Internet. As shown, the server 105 includes several browser-based applications 115. Users of the client devices are provided with access to these applications. The server also includes storage for storing client data files 120.

Client devices 110 include one or more web browsers 130 such as Safari®, Internet Explorer®, Chrome®, Firefox®, etc. The users of the client devices access the applications 115 through a web browser. As shown in FIG. 1, client device 110 has requested access to browser-based application 135 and has downloaded this application from the server 105. In some embodiments, the browser-based application 135 is run as a browser application on the client.

The client devices create and/or download one or more data files 140 from the server. Users of client devices manipulate the data files 140 by using the browser-based application 135. Once the user is done manipulating the data files, the data files are stored locally and/or uploaded to the server for future use.

Many applications such as text editors, spreadsheet generators, presentation software, etc., involve text manipulations and require a mechanism for identifying spelling errors and providing spelling suggestions. One option is to send each word to the server and utilize the server's computing and storage resources to perform spell checking. This option, however, suffers from delays involved in sending the words to the server and getting the spelling errors and suggestions back from the server.

Another option is using spell checkers provided by browsers. Although many browsers have embedded spell checkers, each browser provides a different user interface, a different dictionary, and a different set of suggestions. The user, therefore, may not have a uniform experience for detecting and correcting the same errors when using different browsers. FIG. 2 conceptually illustrates a client device 205 that does spell checking for browser-based applications according to prior art. As shown, the client device 205 includes several browsers 210-225 (e.g., Firefox®, Safari®, Internet Explorer®, and Chrome®, respectively).

In this example, a browser-based application 230 (e.g., Gmail®) is accessed by one of the browsers. The user then uses a user interface provided by the browser-based application 230 to perform text manipulation (e.g., to compose an email). Since each browser provides a different spell checker and a different user interface, the user will have a different spell checking experience when different browsers are used, even though the same browser-based application 230 is used to correct the same error.

FIG. 3 shows suggestions to correct the misspelled word “carr” when Firefox® web-browser 210 is used to access browser-based application 230. The misspelled word 305 is underscored by a wave mark 310 and a drop down list 315 that includes 5 suggestions “Carr,” “care,” “cart,” “car,” and “an” is displayed. In addition, the drop down list 315 includes a variety of other options 320, some of which might be greyed out and inaccessible because they are not applicable to correction of the spelling error.

FIG. 4 shows the suggestions to correct the same misspelled word “carr” when Safari® web-browser 215 is used to access browser-based application 230. The misspelled word 305 is underscored by a dotted line 410 and a drop down list 415 that includes 14 suggestions “car,” “charr,” “Carr,” “card,” “care,” “cart,” “carb,” “carl,” “carp,” “cars,” “cary,” “parr,” “carer,” and “carry” is displayed. In addition, the drop down list 415 includes other options 420, some of which might be greyed out and inaccessible because they are not applicable to correction of the spelling error.

As is clearly shown in the examples of FIGS. 3 and 4, the user has completely different spell checking experiences when the same application is accessed through different browsers. Since the browser-based application 230 uses the browser's spell checker, the number of suggested corrections, the other options provided in the drop down lists, and even the way the misspelled word is underscored are different.

Some embodiments of the invention do not use the browsers' spell checkers and instead provide a spell checker that is used across different browser platforms and operating systems. In these embodiments, the spell checking is performed by the same spell checker on the client side, which allows a uniform user experience regardless of which browser is utilized to access the application 135. Since the misspelled words are not sent to the server to check, the network latencies are avoided and spell checking continues to function when the browser is disconnected from the network (e.g., the Internet) or the backend server. In some of these embodiments, the spell checker is written in a language such as JavaScript, which is executable by browsers. For instance, in some embodiments, the spell checker is a part of browser-based application 135, which is installed as a browser plug-in.

Spell checking in any language requires a list of the words that are correctly spelled in that language. The list may also require periodic updates as new technical, scientific, and other terms are added to the language. FIG. 5 conceptually illustrates the client-server system 100 of FIG. 1 after a browser-based application 135 such as an editor, a spreadsheet application, or a presentation application that manipulates text is downloaded from the server.

As shown, the server 105 maintains a spelling dictionary 505 and a set of common misspellings 510. The dictionary includes a list of the correct spelling for the words in the language (e.g., English) that is used by application 135 to perform spell checking. The set of common misspellings 510 includes a list of commonly misspelled words and their corresponding correct spelling (or spellings). For instance, the set includes groups of words such as (allways, always), (almsot, almost), (alos, also), and (borded, boarded, bordered, border), where a commonly misspelled word such as “allways” is paired with the correct spelling “always” and a commonly misspelled word such as “borded” is paired with several possible correct spellings: “boarded,” “bordered,” and “border.” As shown, the server 105 creates data structure 530 from the dictionary 505 and a data structure 525 from the common misspellings 510.
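For illustration only, the groups listed above might be held on the client in a look-up map of the kind described in the Summary, with each common misspelling as the key and its correct spellings as the values. The names `commonMisspellings` and `lookupCommonMisspelling` are assumptions for this sketch.

```javascript
// Look-up map keyed by the common misspelling; each value is the list of
// suggested correct spellings. The entries are the examples from the text.
const commonMisspellings = new Map([
  ["allways", ["always"]],
  ["almsot", ["almost"]],
  ["alos", ["also"]],
  ["borded", ["boarded", "bordered", "border"]],
]);

// Returns the suggestions for a common misspelling, or an empty list when
// the word is not a known common misspelling.
function lookupCommonMisspelling(word) {
  return commonMisspellings.get(word) || [];
}
```

A word that is absent from the map is not necessarily spelled correctly; in the embodiments described, the dictionary data structure decides that question, and the map only supplies ready-made suggestions.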

In some embodiments, these data structures are prefix trees. A prefix tree (or trie) is an ordered tree data structure where the keys are strings. The position of a node in the tree defines the key with which the node is associated. In a prefix tree, all descendants of a node have a common prefix of the string associated with that node and the root is associated with the empty string. FIG. 6 conceptually illustrates a portion 600 of a dictionary prefix tree in some embodiments of the invention. As shown, each node in the tree other than the root node is associated with a character. Furthermore, the root node 605 is associated with the empty string and each node in the tree is associated with a string that starts from the root node 605 and ends at that node. For instance, node 610 is associated with string “ba” that is not a valid English word. Other nodes such as nodes 615-635 are associated with valid English words “bad,” “bag,” “bagel,” “baggage,” and “baggy,” respectively. These nodes are conceptually marked with a black dot on the tree to identify them as nodes associated with valid words. Some embodiments internally flag these nodes in the data structure that represents the prefix tree.
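A minimal JavaScript sketch of such a prefix tree is given below as an illustration. The class and method names (`TrieNode`, `PrefixTree`, `hasWord`, `hasPrefix`) are hypothetical, and the in-memory layout shown here (a map of children per node) is an assumption chosen for clarity rather than the compact encoding described later.

```javascript
// Each node maps a character to a child node and carries a flag marking
// whether the path from the root to this node spells a valid word.
class TrieNode {
  constructor() {
    this.children = new Map();
    this.isWord = false;
  }
}

class PrefixTree {
  constructor(words = []) {
    this.root = new TrieNode(); // the root is associated with the empty string
    for (const w of words) this.insert(w);
  }

  insert(word) {
    let node = this.root;
    for (const ch of word) {
      if (!node.children.has(ch)) node.children.set(ch, new TrieNode());
      node = node.children.get(ch);
    }
    node.isWord = true; // flag the node that terminates a valid word
  }

  // Returns the node reached by following `s`, or null if no such path exists.
  find(s) {
    let node = this.root;
    for (const ch of s) {
      node = node.children.get(ch);
      if (!node) return null;
    }
    return node;
  }

  hasWord(s) {
    const node = this.find(s);
    return node !== null && node.isWord;
  }

  hasPrefix(s) {
    return this.find(s) !== null;
  }
}
```

With the words of FIG. 6 inserted, `hasPrefix("ba")` is true while `hasWord("ba")` is false, which mirrors the distinction between node 610 and the flagged nodes 615-635.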

Some embodiments utilize the dictionary prefix tree not only to determine whether a word is misspelled but also to provide suggestions to correct the spelling. As described further below, some embodiments add, replace, or delete characters from a misspelled word in order to arrive at a word with correct spelling. These embodiments utilize the dictionary prefix tree to check the partial results after each add, replace, or delete to pursue only the changes that potentially result in a valid word.
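The pruning idea can be sketched in JavaScript as follows. This is an illustrative simplification under stated assumptions, not the process of the embodiments: the functions `buildTrie` and `suggest` and the `maxEdits` parameter are hypothetical, no scoring is performed, and the search walks the prefix tree directly so that a character is only added or substituted if the resulting partial string is a prefix of some dictionary word.

```javascript
// Builds a simple prefix tree from a word list (hypothetical helper).
function buildTrie(words) {
  const root = { children: new Map(), isWord: false };
  for (const w of words) {
    let node = root;
    for (const ch of w) {
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), isWord: false });
      }
      node = node.children.get(ch);
    }
    node.isWord = true;
  }
  return root;
}

// Returns dictionary words reachable from `word` with at most `maxEdits`
// single-character adds, replaces, or deletes.
function suggest(root, word, maxEdits = 1) {
  const chars = Array.from(word);
  const found = new Set();

  // `node` is the trie node for the partial result `built`; `i` is the
  // position in the misspelled word that has been consumed so far.
  function walk(node, i, built, editsLeft) {
    if (i === chars.length && node.isWord && built !== word) found.add(built);
    if (i < chars.length) {
      const next = node.children.get(chars[i]);
      if (next) walk(next, i + 1, built + chars[i], editsLeft); // keep the character
    }
    if (editsLeft === 0) return;
    if (i < chars.length) walk(node, i + 1, built, editsLeft - 1); // delete
    for (const [ch, child] of node.children) {
      // Only children of the current node are tried, so every partial
      // result pursued is a prefix of at least one dictionary word.
      if (i < chars.length && ch !== chars[i]) {
        walk(child, i + 1, built + ch, editsLeft - 1); // replace
      }
      walk(child, i, built + ch, editsLeft - 1); // add
    }
  }

  walk(root, 0, "", maxEdits);
  return [...found].sort();
}
```

For a small assumed dictionary of “car,” “care,” “cart,” “card,” “carry,” “can,” and “bar,” calling `suggest` on “carr” with one edit yields “car,” “card,” “care,” “carry,” and “cart”; “can” and “bar” are two edits away and are not returned.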

Since the prefix trees are considerably large and would incur additional network bandwidth cost if sent in a raw format to the client 110, some embodiments encode and compress the prefix trees into a more compact format before sending the trees to the client. The encoding is designed to reduce the network transfer size, the memory footprint, and the workload on the client when the client decodes the trees. As shown in FIG. 5, server 105 sends the encoded trees 525 and 530 to the client 110 through the network 125.

FIG. 7 conceptually illustrates the client device of FIG. 5 after the client 110 receives the encoded dictionary data structure 530 and the encoded common misspelling data structure 525 from the server. The client 110 decodes the encoded dictionary prefix tree 530 to build the dictionary prefix tree 520. The client 110 also decodes the encoded common misspelling prefix tree 525 to build the common misspelling prefix tree 515. Some embodiments delete the encoded prefix trees after the prefix trees are created. Some embodiments keep the encoded prefix trees (or the decoded prefix trees) in browser cache on the client device 110. In these embodiments, the cache is updated from the server according to the browser rules for updating the cache. Therefore, as long as the encoded prefix trees (or the decoded prefix trees) are not expired in cache, the prefix trees are created (or reused) when the application is run in the browser without requiring the encoded prefix trees to be downloaded from the server.

Within the prefix tree, every tree node has at least three pieces of information: the character associated with the node, the node's children, and a flag (or a bit) identifying whether the string associated with the node is a valid word. To represent the tree structure, a node identification (node ID) is assigned to each node in a breadth-first manner. Starting from the root of the tree, the root node is given a node ID of 0. The root's children are then assigned node IDs of 1, 2, . . . k from left to right, where k is the number of first-level nodes in the tree.

FIG. 8 conceptually illustrates a portion 800 of a prefix tree in some embodiments of the invention. As shown, the root node 805 has a node ID of 0 and the root node's direct children (first-level nodes) 810-830 have node IDs 1-k, respectively. The assignment is then repeated with the second-level nodes from left to right, keeping the same order. As such, the children 835-845 of “node 1” 810 (the first child of the root node) have IDs k+1, k+2, . . . k+r, where r is the number of children of node 1. Similarly, the children of “node 2” 815 (the second child of the root node) have IDs k+r+1, k+r+2, . . . k+r+s, where s is the number of children of node 2.
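The breadth-first numbering can be sketched in JavaScript as shown below. The helper names `buildTrie` and `flattenBreadthFirst` are hypothetical, and ordering the children of a node by character is an assumption made here so that “left to right” is well defined; the description does not state how siblings are ordered.

```javascript
// Builds a simple prefix tree from a word list (hypothetical helper).
function buildTrie(words) {
  const root = { children: new Map(), isWord: false };
  for (const w of words) {
    let node = root;
    for (const ch of w) {
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), isWord: false });
      }
      node = node.children.get(ch);
    }
    node.isWord = true;
  }
  return root;
}

// Assigns node IDs in breadth-first order and returns three arrays indexed
// by node ID: the node character, the number of children, and the valid-word flag.
function flattenBreadthFirst(root) {
  const chars = [""];   // the root (ID 0) carries no character
  const numChildren = [];
  const isWord = [];
  const queue = [root]; // position in the queue equals the node ID
  for (let id = 0; id < queue.length; id++) {
    const node = queue[id];
    // Assumption: siblings are ordered by character.
    const kids = [...node.children.entries()].sort(([a], [b]) =>
      a < b ? -1 : a > b ? 1 : 0
    );
    numChildren[id] = kids.length;
    isWord[id] = node.isWord;
    for (const [ch, child] of kids) {
      chars.push(ch);   // the child's ID is its index in both arrays
      queue.push(child);
    }
  }
  return { chars, numChildren, isWord };
}
```

For the assumed word list “bad,” “bag,” and “cab,” the root receives ID 0, the first-level nodes “b” and “c” receive IDs 1 and 2, the two “a” nodes receive IDs 3 and 4, and the nodes ending “bad,” “bag,” and “cab” receive IDs 5, 6, and 7.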

With this node ID assignment scheme, all children of any node are automatically grouped together. To represent all the parent-child relationships in the prefix tree, for each node i, it suffices to store only the node ID of the left-most child, left_child_index[i], and the number of children, num_children[i]. Using the example above, left_child_index[0]=1 and num_children[0]=k. Similarly, left_child_index[1]=k+1, num_children[1]=r, left_child_index[2]=k+r+1, num_children[2]=s, etc. Following this pattern, for every node i, left_child_index[i]+num_children[i]=left_child_index[i+1]. On this premise, some embodiments encode the whole tree structure by storing only the list of the numbers of children, ordered by node ID. The left_child_index array can then be recovered mathematically.
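The recovery amounts to a running sum over num_children. The following JavaScript sketch illustrates it together with a word look-up over the flattened arrays; the function names and the shape of the `flat` object are assumptions for this example.

```javascript
// Recovers left_child_index from num_children using
// left_child_index[0] = 1 and
// left_child_index[i + 1] = left_child_index[i] + num_children[i].
function recoverLeftChildIndex(numChildren) {
  const left = new Array(numChildren.length);
  let next = 1; // the root's first child has node ID 1
  for (let i = 0; i < numChildren.length; i++) {
    left[i] = next;
    next += numChildren[i];
  }
  return left;
}

// Checks whether `s` is a valid word by walking the flattened tree.
// `flat` holds the arrays chars, numChildren, isWord, and leftChildIndex,
// all indexed by node ID.
function isValidWord(flat, s) {
  let id = 0; // start at the root
  for (const ch of s) {
    const start = flat.leftChildIndex[id];
    const end = start + flat.numChildren[id];
    let found = -1;
    for (let c = start; c < end; c++) {
      if (flat.chars[c] === ch) {
        found = c;
        break;
      }
    }
    if (found < 0) return false; // no child carries this character
    id = found;
  }
  return flat.isWord[id];
}
```

For a node without children, the recovered index simply points at the position where its children would have started, so the identity above holds for every node.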

To minimize network transfer and take advantage of the fact that the number of children of any node is reasonably capped, some embodiments encode each entry in num_children with a single ASCII character. For the characters in the tree nodes, some embodiments concatenate all node characters, ordered by node ID, into a single string to be sent to the client. The valid word bits are also concatenated accordingly. These bits are packed into bytes (8 bits per byte) and encoded using the Base64 encoding scheme in some embodiments. Base64 encoding represents binary data in an ASCII string format by translating the data into a radix-64 representation. The preceding technique produces three long strings of ASCII characters, which can be effectively compressed further with a lossless data compression algorithm such as the DEFLATE algorithm.
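A possible shape of this encoding is sketched below in JavaScript. Several details are assumptions made for illustration: the offset `COUNT_OFFSET` used to map a child count to an ASCII character, the most-significant-bit-first packing order, and the function names. The sketch uses the `btoa` and `atob` functions available in browsers (and in recent Node.js versions) and omits the final lossless compression step.

```javascript
// Assumption: a child count n is stored as the character with code 48 + n,
// so counts 0-9 appear as the digits "0"-"9".
const COUNT_OFFSET = 48;

// Produces the three ASCII strings: child counts, node characters, and
// Base64-encoded valid-word bits, all ordered by node ID.
function encodeTree({ chars, numChildren, isWord }) {
  const counts = numChildren
    .map((n) => String.fromCharCode(COUNT_OFFSET + n))
    .join("");
  const letters = chars.join(""); // the root contributes the empty string

  // Pack the valid-word flags 8 per byte, most significant bit first.
  const bytes = new Uint8Array(Math.ceil(isWord.length / 8));
  isWord.forEach((flag, i) => {
    if (flag) bytes[i >> 3] |= 0x80 >> (i & 7);
  });
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  const flags = btoa(binary);

  return { counts, letters, flags };
}

// Reverses encodeTree on the client side.
function decodeTree({ counts, letters, flags }) {
  const numChildren = Array.from(
    counts,
    (c) => c.charCodeAt(0) - COUNT_OFFSET
  );
  const chars = ["", ...letters]; // restore the root's empty entry
  const packed = atob(flags);
  const isWord = numChildren.map(
    (_, i) => (packed.charCodeAt(i >> 3) & (0x80 >> (i & 7))) !== 0
  );
  return { chars, numChildren, isWord };
}
```

With this offset the counts string stays within printable ASCII only while no node has more than 78 children, which is one reason the sketch should be read as an assumption rather than the encoding of the embodiments.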

On the client side, the characters, bits, and left_child_index are recovered by applying the same steps in reverse. To further reduce the amount of client memory used to store the left child indices, some embodiments encode each number as two 16-bit characters and store the characters directly in one long string. Empirically, this amounts to a 60% saving in data size compared to storing the indices as a regular number array, while losing only minimal efficiency.
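One way to store each index as two 16-bit characters is sketched below in JavaScript. The function names `packIndices` and `readIndex` and the choice of placing the high half first are assumptions for this illustration; the sketch does not attempt to reproduce the memory measurements mentioned above.

```javascript
// Packs each 32-bit index into two 16-bit characters (high half first)
// and concatenates them into one long string.
function packIndices(indices) {
  let packed = "";
  for (const n of indices) {
    packed += String.fromCharCode((n >>> 16) & 0xffff, n & 0xffff);
  }
  return packed;
}

// Reads the i-th index back from the packed string.
function readIndex(packed, i) {
  const high = packed.charCodeAt(2 * i);
  const low = packed.charCodeAt(2 * i + 1);
  return ((high << 16) | low) >>> 0; // force an unsigned 32-bit result
}
```

Reading an index costs two `charCodeAt` calls and a shift instead of a single array access, which is consistent with the small loss of efficiency noted in the text.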

FIGS. 9 and 10 illustrate another alternative embodiment for sending thedictionary and the list of common misspellings from the server to theclient. FIG. 9 conceptually illustrates the client-server system 100 ofFIG. 1 after a browser-based application 135 such as an editor, aspreadsheet application, or a presentation application that manipulatestext is downloaded from the server. As shown, the server 105 maintains aspelling dictionary 505 and a set of common misspellings 510. As shownin FIG. 9, server 105 has encoded the dictionary prefix tree 520 andsends the encoded tree 530 to the client 110 through the network 125. Inthis embodiment, the server creates a look-up map 925 for the commonmisspelling list and sends the look-up map to the client. In someembodiments, the server also encodes and compresses the look-up mapprior to sending the look-up map to the client. In other embodiments,the server does not encode the look-up map.

FIG. 10 conceptually illustrates the client device of FIG. 9 after theclient 110 receives the encoded dictionary data structure 525 and thecommon misspelling data structure 925 from the server. The client 110decodes the encoded dictionary prefix tree 530 to build the dictionaryprefix tree 520. The client 110 uses the look-up map common misspellingprefix tree 925 to search for common misspellings. In the embodimentsthat the server encodes the look-up map, the client decodes (not shown)the look-up map. Some embodiments delete the encoded prefix tree and theencoded look-up map after the prefix trees are created. Some embodimentskeep the look-up map and the encoded prefix tree (or the decoded prefixtrees) in browser cache on the client device 110. In these embodiments,the cache is updated from the server according to the browser rules forupdating the cache. Therefore, as long as the look-up map and theencoded prefix tree (or the decoded prefix trees) are not expired incache, the look-up map is reused and the prefix tree is created (orreused) when the application is run in the browser without requiring thedata structures to be downloaded from the server.

FIGS. 11 and 12 illustrate yet another alternative embodiment forsending the dictionary and the list of common misspellings from theserver to the client. FIG. 11 conceptually illustrates the client-serversystem 100 of FIG. 1 after a browser-based application 135 such as aneditor, a spreadsheet application, or a presentation application thatmanipulates text is downloaded from the server. As shown, the server 105maintains a spelling dictionary 505 and a set of common misspellings510. The dictionary includes a list of the correct spelling for thewords in the language (e.g., English) that is used by application 135 toperform spell checking. As shown, the dictionary 505 and the commonmisspellings 510 are included in a browser compatible data file 1120such as a JavaScript file and are sent to the client 110.

FIG. 12 conceptually illustrates the client device of FIG. 11 after the client receives a browser compatible data file that includes the dictionary and the list of common misspellings from the server. The browser compatible data file 1115 is used to build a prefix tree 1220 for the dictionary and a prefix tree 515 for the common misspellings. In the embodiments that utilize a look-up map instead of a prefix tree for common misspellings, the client extracts the look-up map from the browser compatible data file. In alternative embodiments, the client receives a list of common misspellings in the browser compatible file and creates the look-up map at the client.

Some embodiments delete the browser compatible data file after the prefix trees are created. Some embodiments keep the browser compatible data file (or the data for the dictionary and the common misspellings contained in the browser compatible data file) in browser cache on the client device 110. In these embodiments, the cache is updated from the server according to the browser rules for updating the cache. Therefore, as long as the browser compatible data file (or the data for the dictionary and the common misspellings contained in the browser compatible data file) has not expired in cache, the prefix trees are created when the application is run in the browser without requiring the browser compatible data file to be downloaded from the server.

Several more detailed embodiments of the invention are described in sections below. Section I describes identifying spelling errors in a web-based document creation and manipulation system in some embodiments. Next, Section II describes providing correction suggestions for misspelled words in some embodiments of the invention. Section III describes a novel user interface for displaying spelling suggestions and receiving user selections. Section IV describes the software architecture of some embodiments. Finally, a description of an electronic system with which some embodiments of the invention are implemented is provided in Section V.

I. Identifying Spelling Errors in a Web-Based Document Creation and Manipulation System

FIG. 13 conceptually illustrates a process 1300 for sending an application and associated files from a server to a client in some embodiments of the invention. As shown, the process receives (at 1305) a request from a client device to use an application that utilizes spell checking. For instance, a user of the client device 110 in FIG. 1 uses one of the client browsers 130 to access a web page provided by the server 105 and requests access to a word processing application. Next, process 1300 sends (at 1310) the requested application to the client. For instance, the server sends the application 135 in FIG. 1 as a plug-in or other browser executable form to the client.

Next, the process determines (at 1315) whether the client device is requesting to receive a data file to be accessed by the application. For instance, the user of client device 110 in FIG. 1 uses the provided application 135 to edit a data file previously stored on the server. If the client device is requesting a data file, the process sends (at 1335) the requested data file to the client. The process then proceeds to 1315, which was described above. Otherwise, the process determines (at 1320) whether the client device is requesting to receive the dictionary and the list of common misspellings. For instance, when the user opens the browser-based application 135, the application determines whether copies of previously generated prefix trees for the dictionary and the look-up map (or prefix tree) for the common misspellings are still available in browser cache. If not, the application (through the browser) requests a copy of the dictionary and common misspelling files from the server.

When the client application is not requesting to receive the dictionary and common misspelling files, the process ends. Otherwise, at 1325, the process creates the data structures (or uses an existing version of each data structure if the information in the data structure is still valid) for the spell checking dictionary and common misspellings. In some embodiments, the data structure for the spell checking dictionary is a prefix tree. In some embodiments, the data structure for the common misspellings is also a prefix tree. In other embodiments, the data structure for common misspellings is a look-up map. The process then encodes the spell checking dictionary prefix tree. In the embodiments that utilize a prefix tree for common misspellings, the server also encodes the common misspellings prefix tree. In some embodiments, the request from the client application also indicates the language used by the user. Process 1300 utilizes the language information to send the proper dictionary and common misspelling files. The process then sends (at 1330) the encoded data structures to the client device. The process then ends.

In the embodiments that build the prefix trees on the client side (as described by reference to FIGS. 11 and 12, above), process 1300 replaces operations 1325 and 1330 with the following operations. The process includes (at 1325) the spell checking dictionary and a list of common misspellings in one or more browser compatible files (such as one or more JavaScript files). The process then sends (at 1330) the browser compatible file (or files) to the client device in order for the client to build the corresponding data structures. The process then ends.

FIG. 14 conceptually illustrates a process 1400 for utilizing a web-based application on a client device in some embodiments of the invention. Process 1400 is utilized in embodiments described by reference to FIGS. 5 and 7, above. As shown, the process sends (at 1405) a request to a server to use an application that utilizes spell checking. FIG. 15 illustrates an example of how a user requests an application from a server in some embodiments of the invention. In this example, the user of a client device has used a browser to access a web page 1500 of a server that provides access to applications and remote file storage. The web page 1500 is displayed after the user has successfully signed up to the server.

As shown, the web page provides options 1510-1520 to the user to access different applications for text editing, spreadsheet generation, and presentation generation, respectively. The user is also provided with an option 1525 to retrieve files previously stored on the server and an option 1530 to store new files on the server. When the user selects one of the options 1510-1520, process 1400 sends a request to the server to use the corresponding application.

Next, process 1400 receives (at 1410) the application from the server. In some embodiments, the application includes a spell checker. In some of these embodiments the spell checker is written in a language such as JavaScript, which is executable by browsers and is installed, e.g., as a browser application. The process then determines (at 1415) whether valid dictionary and common misspelling files (e.g., the data structures associated with the corresponding prefix trees or look-up map) exist in the client. For instance, if the files are kept in browser cache, the process determines that the files are still valid. If valid copies exist, the process proceeds to 1430, which is described below. Otherwise, the process sends (at 1420) a request to the server to receive a copy of the data structures (e.g., the encoded prefix tree or look-up map) for the dictionary and the common misspelling files. The request also identifies the language used by the application to edit text and perform spell checking.

The process then receives (at 1425) data structures (e.g., encoded prefix trees or look-up map) associated with the spell checking dictionary and the list of common misspellings. For instance, process 1300 in the server generates the prefix trees or look-up map files (or identifies existing files) and encodes them as described by reference to FIG. 8, above.

Process 1400 then decodes (at 1430) the received dictionary data structure (e.g., encoded dictionary prefix tree) to build the dictionary prefix tree. As described by reference to FIG. 6, above, a prefix tree is an ordered tree data structure where the keys are strings. The position of a node in the tree defines the key with which the node is associated. In a prefix tree, all descendants of a node have a common prefix of the string associated with that node and the root is associated with the empty string. The process decodes (at 1435) the received common misspelling data structure (e.g., encoded common misspelling prefix tree or look-up map) to build the common misspelling prefix tree. In the embodiments in which the server sends the look-up map to the client without encoding the look-up map, operation 1435 is bypassed. The process then ends.

FIG. 16 conceptually illustrates a process 1600 in an alternative embodiment for utilizing a web-based application on a client device in some embodiments of the invention. Process 1600 is utilized in embodiments described by reference to FIGS. 11 and 12, above. As shown, the process sends (at 1605) a request to a server to use an application that utilizes spell checking. Next, process 1600 receives (at 1610) the application from the server. In some embodiments, the application includes a spell checker. In some of these embodiments the spell checker is written in a language such as JavaScript, which is executable by browsers and is installed, e.g., as a browser application.

The process then determines (at 1615) whether valid dictionary and common misspelling files (or a valid copy of a browser compatible file that includes dictionary and common misspelling data) exist in the client. For instance, if the files are kept in browser cache, the process determines that the files are still valid. If valid copies exist, the process proceeds to 1630, which is described below. Otherwise, the process sends (at 1620) a request to the server to receive a copy of the dictionary and the common misspelling files. The request also identifies the language used by the application to edit text and perform spell checking.

The process then receives (at 1625) a spell checking dictionary and a list of common misspellings in one or more browser compatible files. For instance, process 1300 in the server generates a script file and includes the dictionary listing and the common misspellings in the script file. Using a compact browser compatible format such as JavaScript allows for fast transfer of the file from the server to the client.

Process 1600 then builds (at 1630) a prefix tree from the dictionary listing. As described by reference to FIG. 6, above, a prefix tree is an ordered tree data structure where the keys are strings. The position of a node in the tree defines the key with which the node is associated. In a prefix tree, all descendants of a node have a common prefix of the string associated with that node and the root is associated with the empty string. The process then builds (at 1635) a prefix tree from the common misspelling listing. In the embodiments that utilize a look-up map, operation 1635 creates the look-up map from the received common misspelling list. In the embodiments in which the client receives the look-up map from the server, operation 1635 is bypassed. The process then ends.

Different embodiments described in this specification utilize different data structures for the common misspellings. Some embodiments utilize a prefix tree. Other embodiments utilize a look-up map. FIG. 17 conceptually illustrates a portion 1700 of a common misspellings prefix tree in some embodiments of the invention. As shown, the root node 1705 is associated with an empty string. In this example, strings for four common misspellings “ade,” “addres,” “adress,” and “ador” are shown. The leaf nodes 1710-1725 point to correct spelling for each string. As shown, the correct spelling for both strings “addres” and “adress” is “address.” On the other hand, there are two suggestions, “adore” and “adorn,” for the misspelled string “ador.” As described further below, the common misspelling prefix tree is utilized to quickly identify whether a typed word is a common misspelling and to provide one or more suggestions for the misspelled word.

FIG. 18 conceptually illustrates how a hash table is used as a look-up map for common misspellings in some embodiments of the invention. As shown, the keys 1805 for the hash table are common misspellings. In this example, strings 1825-1840 for four common misspellings “ade,” “addres,” “adress,” and “ador” are shown, respectively. The hash function 1810 maps each key to a bucket or slot. The buckets 1815 point to strings 1820 for suggestions. As shown, the common misspellings addres 1825 and adress 1840 are both mapped to the suggestion 1850. The common misspelling ade 1830 is mapped to suggestion 1845. The common misspelling ador 1835 is mapped to a bucket 1870, which points to two suggestions 1855 and 1860. Using a common misspelling as a key, therefore, results in one or more suggestions. Searching for a string that is not a common misspelling causes the look-up map to return an exception (such as a null value) to indicate that the string is not found in the look-up map.
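The look-up map behavior described above can be sketched with a built-in JavaScript Map, which is one possible hash-based container and not necessarily the one used in any embodiment. The names commonMisspellings and lookUpSuggestions are hypothetical. Only the three misspellings whose corrections are stated in this description are included; the correction for “ade” is not given, so it is left out.

```javascript
// Illustrative look-up map: each key is a common misspelling and each value
// is the list of correction suggestions it points to.
const commonMisspellings = new Map([
  ["addres", ["address"]],
  ["adress", ["address"]],
  ["ador", ["adore", "adorn"]],
]);

// Returns the suggestions for a common misspelling, or null when the string
// is not a key in the look-up map.
function lookUpSuggestions(word) {
  const suggestions = commonMisspellings.get(word);
  return suggestions === undefined ? null : suggestions;
}
```

A null result here plays the role of the "not found" indication, so the caller can fall through to the dictionary-based suggestion search.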

FIG. 19 conceptually illustrates a process 1900 for determining whether a set of typed words includes misspelled words in some embodiments of the invention. Process 1900 is utilized, e.g., to determine whether a file that includes text has misspellings. The process is also used as individual words are typed to determine whether the word is misspelled (in this case the number of words in the set is one).

As shown, the process receives (at 1905) a set of words to spell check. The process then identifies and sets (at 1910) the first word in the set as the current word. For instance, the process scans (or parses) the characters in a text string and identifies terminators such as punctuation marks that delimit words in sentences and phrases.

The process then determines (at 1915) whether the current word matches a string in the dictionary prefix tree that is identified as a valid word. For instance, assume that the typed string is “bag” and the dictionary prefix tree is prefix tree 600 shown in FIG. 6. A search of the tree results in a match between the typed word and the string terminated at node 620. Since node 620 is marked as the last node for a valid string (as conceptually shown with a black dot), process 1900 determines that the string “bag” is a correctly spelled word. On the other hand, assume that the typed string is “bagg.” Although this string matches the string terminated at node 640, the string is not identified as a valid word by the prefix tree since the node 640 is not identified as the last node for a valid string. When the process determines that the current word is a valid word, the process proceeds to 1955, which is described below.
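The distinction between reaching a node and reaching a node that terminates a valid word can be illustrated with a minimal prefix tree sketch. All names (createNode, insertWord, findNode, isValidWord) are hypothetical, and the two-word dictionary is an assumption chosen only so that “bagg” exists as a path in the tree without being a word; it is not the content of prefix tree 600.

```javascript
// Each node keeps its children by character plus a flag that plays the role
// of the "black dot" marking the last node of a valid word.
function createNode() {
  return { children: new Map(), isWord: false };
}

// Adds one word to the tree, creating nodes along the path as needed.
function insertWord(root, word) {
  let node = root;
  for (const ch of word) {
    if (!node.children.has(ch)) node.children.set(ch, createNode());
    node = node.children.get(ch);
  }
  node.isWord = true;
}

// Walks the tree along the given string; returns the last node or null.
function findNode(root, text) {
  let node = root;
  for (const ch of text) {
    node = node.children.get(ch);
    if (node === undefined) return null;
  }
  return node;
}

// A string is a valid word only if its path exists AND ends on a flagged node.
function isValidWord(root, text) {
  const node = findNode(root, text);
  return node !== null && node.isWord;
}

// Assumed sample dictionary (hypothetical).
const dictionaryRoot = createNode();
for (const word of ["bag", "baggage"]) insertWord(dictionaryRoot, word);
```

With this sample data, isValidWord(dictionaryRoot, "bag") is true, while "bagg" is found by findNode but rejected by isValidWord because its last node is not flagged.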

Otherwise, the process determines (at 1920) whether the string is a compound word. FIG. 20 illustrates a user interface 2000 in some embodiments of the invention. The user interface shows a page of a document and includes several examples of the strings that are not in the dictionary but are considered valid words and are not marked as misspelled words in some embodiments of the invention. The page also includes a misspelled string “addres” 2025, which is underlined to visually indicate that the string is misspelled.

As shown, the string “BestBikeRentalInTown” 2005 is not underlined as a misspelled word. The string includes multiple uppercase (or capital) letters that separate several valid words. This type of string is referred to as camel case, where the string includes inner uppercase letters. In this example, valid strings “Best,” “Bike,” “Rental,” “In,” and “Town” are joined without spaces. Many product names are written as camel case. In addition, some embodiments consider a hyphenated compound of valid words to be a valid compound word. Process 1900, therefore, identifies compound words (such as camel case strings) and does not mark them as invalid words. When the string is not a compound word, the process proceeds to 1945, which is described below.

Otherwise, the process breaks (at 1930) the compound word into individual words. The process then adds (at 1935) the individual words to the set of words for further spell checking. The process then sets the first individual word in the compound word as the current word. The process then proceeds to 1920, which was described above.
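Breaking a compound word into its parts can be sketched as follows. The function name splitCompound and the exact splitting rule (split at hyphens, then before every uppercase letter) are assumptions for illustration; an actual embodiment may use different boundaries, for example to keep runs of capitals together.

```javascript
// Splits a camel case or hyphenated compound into candidate individual words.
// Splitting happens at each hyphen and immediately before each uppercase
// letter, so an all-uppercase string is broken into single letters.
function splitCompound(word) {
  return word
    .split("-")
    .flatMap((part) => part.split(/(?=[A-Z])/))
    .filter((part) => part.length > 0);
}

// A string is treated as a compound when it yields more than one part.
function isCompound(word) {
  return splitCompound(word).length > 1;
}
```

For “BestBikeRentalInTown” this yields the five parts “Best,” “Bike,” “Rental,” “In,” and “Town,” each of which can then be checked individually.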

The process determines (at 1945) whether the current word as typed is a valid word based on a set of rules that identify words that are not in the dictionary but are intended to be valid as typed. For instance, IP addresses are widely used in modern technical text and are not generally found in the dictionary but are nevertheless valid strings since the user intends to type them in predetermined formats (e.g., 192.168.1.1). In the example of FIG. 20, the IP address 2020 is not underlined as the string is recognized as a valid string. Similarly, Internet domain names are widely used and are not generally found in the dictionary but are considered valid strings since the user intends to type them in one or more predetermined formats (e.g., www.xyz.com or https://xyz.com, etc.). Some embodiments include a set of rules to identify different strings that are not in the dictionary but, based on their format or the sequence of characters, are considered valid as typed.

For instance, the set of rules in some embodiments identifies abbreviations, prefixes, and postfixes, such as “ing,” “ed,” “'s,” etc., as valid and removes them prior to spell checking a word. For instance, the string “We're” 2015 is not marked as misspelled in FIG. 20. Some embodiments consider a valid word followed by an apostrophe followed by a suffix such as “s,” “ed,” “er,” “ll,” “ve,” “ing,” etc., to also be a valid word.

The set of rules, in some embodiments, also excludes words with decimal points, hexadecimal and octal numbers, etc. For instance, in FIG. 20, string “7cdea” is recognized as a valid hexadecimal number and is not marked as misspelled. In addition, some embodiments consider a word consisting entirely of hexadecimal digits, with or without being preceded by “0x,” as a valid word. Some embodiments consider a word consisting entirely of decimal digits as a valid word. Some embodiments also recognize a word that includes at most 6 uppercase letters plus optionally some decimal points (or plus optionally some decimal digits in even positions and periods in odd positions) as a valid word.
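A few of the valid-as-typed rules described above can be sketched as a list of predicates. The regular expressions and the names validAsTypedRules and isValidAsTyped are assumptions made for this illustration; they approximate the described formats and are not the actual rule set of any embodiment.

```javascript
// Each rule returns true when the string should be accepted as typed.
const validAsTypedRules = [
  // Dotted-quad IP address such as 192.168.1.1 (octet ranges are not checked).
  (s) => /^\d{1,3}(\.\d{1,3}){3}$/.test(s),
  // A word consisting entirely of decimal digits.
  (s) => /^\d+$/.test(s),
  // A word consisting entirely of hexadecimal digits, optionally after "0x".
  (s) => /^(0x)?[0-9a-fA-F]+$/.test(s),
  // A domain name or simple URL such as www.xyz.com or https://xyz.com.
  (s) => /^(https?:\/\/)?([a-z0-9-]+\.)+[a-z]{2,}$/i.test(s),
];

// A string is valid as typed when at least one rule accepts it.
function isValidAsTyped(s) {
  return validAsTypedRules.some((rule) => rule(s));
}
```

Because these rules run only after the dictionary check fails, a rule as broad as the hexadecimal one also accepts letter-only strings such as “adde”; an embodiment could narrow it if that proves too permissive.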

When process 1900 determines that a string is valid as typed based on the spell checking rules, the process proceeds to 1955, which is described below. Otherwise, the process marks (at 1950) the string as misspelled. A misspelled word is visually identified (e.g., as underlined) in some embodiments. As shown in FIG. 20, the string “addres” 2025 is underlined to be visually identified as a misspelled word.

The process then determines (at 1955) whether spell checking for all words in the set is completed. If so, the process ends. Otherwise, the process sets (at 1960) the next word in the set as the current word. The process then proceeds to 1920, which was described above.

II. Providing Correction Suggestions for Misspelled Words

After a word is identified as a misspelled word, some embodiments provide a list of suggestions to correct the misspelling. In some embodiments, as soon as a word is identified as a misspelled word, a list of suggestions is prepared and stored. Once the user requests suggestions, the list is displayed. In other embodiments, the list is prepared and displayed after the user requests suggestions.

Some embodiments provide up to a predetermined number of suggestions for each misspelled word. For instance, when the predetermined number is set to three and the number of suggestions is three or less, all suggestions are displayed. On the other hand, when the number of suggestions is more than three, then the suggestions are scored and only the top three scored suggestions are displayed. Different criteria for scoring the suggestions are described further below.
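The selection rule above can be sketched in a few lines. The function name topSuggestions, the default limit of three, and the shape of the input (objects with word and score fields) are assumptions for illustration only.

```javascript
// Returns every suggestion when there are at most `limit` of them; otherwise
// returns the `limit` suggestions with the highest scores.
function topSuggestions(scored, limit = 3) {
  if (scored.length <= limit) {
    return scored.map((entry) => entry.word);
  }
  return [...scored]
    .sort((a, b) => b.score - a.score) // highest score first
    .slice(0, limit)
    .map((entry) => entry.word);
}
```

Copying the array before sorting keeps the caller's candidate list unchanged, which matters if the same list is rescored later.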

FIGS. 21A and 21B conceptually illustrate a process 2100 for identifying correction suggestions for misspelled words in some embodiments of the invention. As shown, the process receives (at 2105) a misspelled word. In some embodiments when process 1900 identifies a misspelled word, process 2100 receives the word in order to determine a set of suggestions and store them to display to the user upon request. In other embodiments, the suggestions are determined only after the user requests suggestions. In these embodiments, process 2100 receives the misspelled word after the user requests suggestions.

Next, process 2100 determines (at 2110) whether the word is a common misspelling (e.g., by using the look-up map or searching the common misspelling prefix tree). For instance, if the word is “addres” and the common misspelling prefix tree is as shown in FIG. 17, the word matches the string terminated at node 1715 and the correction suggestion is “address” (as pointed to by node 1715). On the other hand, if the word is “ador,” there are two suggestions: “adore” and “adorn,” as pointed to by node 1725. A similar result is achieved for the embodiments that utilize a look-up map (e.g., as shown in FIG. 18) for the common misspellings.

When the string does not match any common misspelling, the process proceeds to 2125, which is described below. Otherwise, when the string matches one or more common misspelling strings that point to correction suggestions, the suggestions are scored (at 2115) and are added (at 2120) to a list of possible suggestions. Some embodiments give suggestions provided by the common misspelling list the highest score and always provide them to the user. Other embodiments give high scores to the suggestions provided by the common misspelling list but compare the scores with the scores for other suggestions found by editing the misspelled word as described below.

In order to find more suggestions, the process edits the misspelled word by adding characters to the word or by replacing or deleting the characters in the word. A naive, and commonly used, approach would be to alter the input string first, generating a large number of candidates, and then to verify each of those candidates against a dictionary. In contrast, some embodiments utilize a simple traversal of the prefix tree to extract only correct words similar to the string in question. This process often generates the same suggestion more than once. The number of times each of the suggestions was generated serves as a score that ultimately determines which suggestions to display to the user. By building up words incrementally, the amount of vocabulary to search through is limited. An important benefit of using this approach is keeping string manipulation to a minimum, which is key to good performance in a browser-based application written in, e.g., JavaScript. These enhancements allow a user to get fast feedback and correction suggestions.

Some embodiments allow a predetermined number of edits for each character position. For instance, when the misspelled word is “peopel,” the first character can be changed from “p” to any other character. One possible edit, changing the first “p” to “t,” results in “teopel.” This is referred to as 1 edit-distance. A second possible edit for the first character position is to add another character either to the left or to the right of the first character. This is referred to as 2 edit-distance. Some possible 2 edit-distance edits for the first character are “ateopel,” “taeopel,” “bteopel,” “tbeopel,” etc.

After each add, replace, or delete, the edited word is checked against the strings in the dictionary prefix tree to find a match and only changes for which a match is found are further pursued. A match is found when the edited portion of the misspelled word (i.e., the string in the edited misspelled word starting from the first character to the last added/replaced/deleted character) matches a string in the prefix tree starting from a first node after the root of the tree all the way to either a middle node or a leaf node that terminates a valid word. The last character edit is discarded if searching the prefix tree does not result in a match with a valid word. The edited word is further edited by adding, replacing, or deleting other characters in the edited word if the partial result matches any strings in the prefix tree. The process continues until the edited word (from the first character to the last character in word) matches a string associated to a valid word in the tree.

The following example clarifies how suggestions are found in some embodiments of the invention. FIG. 22 conceptually illustrates a dictionary prefix tree in some embodiments of the invention. For simplicity, assume that the universe of valid words is limited to the strings shown in FIG. 22. The tree indicates that the valid words are “cap,” “car,” “coop,” “captain,” and “capture.” The terminating nodes 2225-2245 are conceptually marked with a black dot in the figure to identify the strings that correspond to valid words. Some embodiments define a data structure for the prefix tree and flag the nodes that correspond to the last character in valid words.

When the misspelled word is “cbp,” there is no point in changing “c” to any other character, since the resulting words do not match any strings in the prefix tree (as in this hypothetical example all valid words start with “c”). Also, there is no point in adding any characters before “c” since in this example no valid word has “c” in the second character position.

However, replacing “b” with “a” results in a match between “ca” (which is the string in the edited misspelled word starting from the first character to the last added/modified/deleted character) and the string starting from the first node 2210 after the root node 2205 of the tree and ending at node 2215. Although this string is not associated with a valid word (as node 2215 is not marked with a black dot), the match creates the possibility for the misspelled word to be further edited to result in a valid word (e.g., “cap” or “car”). Similarly, replacing “b” with “o” results in a match between “co” and the string that starts at node 2210 and ends at node 2220.

Changes to the misspelled word “cbp” continue until a change results in a valid string such as “cap,” “car,” or “coop.” Some of the valid strings can be reached multiple times. For instance, editing “cbp” results in “coop” if a character is added after “c” and “b” is changed to “o.” Editing “cbp” also results in “coop” if “b” is changed to “o,” “p” is changed to “o,” and a “p” is added after the last character. Some embodiments assign a score to a suggestion depending on how easily a misspelled word can be edited to reach the suggestion (i.e., how many characters have to be added/modified/deleted) or how many times editing the misspelled word results in the same suggestion. The more easily a misspelled word can be edited to reach a suggestion and the more ways that editing the misspelled word results in the same suggestion, the higher the score for that suggestion.
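The prefix-tree-guided search in this example can be illustrated with a small recursive sketch. It is a simplified stand-in rather than the flow of process 2100: it caps the total number of edits per word (maxEdits, assumed to be 2) instead of limiting edits per character position, and it uses the number of distinct edit sequences that reach a word as the score. The names makeTrie and suggest are hypothetical, and the sketch assumes words made of single-code-unit characters.

```javascript
// Builds a prefix tree whose nodes flag the end of valid words.
function makeTrie(words) {
  const root = { children: new Map(), isWord: false };
  for (const word of words) {
    let node = root;
    for (const ch of word) {
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), isWord: false });
      }
      node = node.children.get(ch);
    }
    node.isWord = true;
  }
  return root;
}

// Walks the tree and the typed word together. Only characters that exist as
// children of the current node are tried, so dead-end edits are never built.
function suggest(root, typed, maxEdits = 2) {
  const hits = new Map(); // suggestion -> number of edit sequences reaching it

  function explore(node, i, editsLeft, built) {
    if (i === typed.length && node.isWord) {
      hits.set(built, (hits.get(built) || 0) + 1);
    }
    if (i < typed.length) {
      // Keep the typed character when the tree allows it (no edit spent).
      const next = node.children.get(typed[i]);
      if (next !== undefined) explore(next, i + 1, editsLeft, built + typed[i]);
    }
    if (editsLeft === 0) return;
    for (const [ch, child] of node.children) {
      explore(child, i, editsLeft - 1, built + ch); // add a character
      if (i < typed.length && ch !== typed[i]) {
        explore(child, i + 1, editsLeft - 1, built + ch); // replace a character
      }
    }
    if (i < typed.length) explore(node, i + 1, editsLeft - 1, built); // delete
  }

  explore(root, 0, maxEdits, "");
  return [...hits.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([word, count]) => ({ word, count }));
}

const fig22Words = makeTrie(["cap", "car", "coop", "captain", "capture"]);
const cbpSuggestions = suggest(fig22Words, "cbp");
```

Under these assumptions, “cbp” yields “cap,” “coop,” and “car” in that order, while “captain” and “capture” are never reached because they need more than two edits. Counting edit sequences is only one of the scoring criteria mentioned above; a fuller sketch would also weigh how few edits each path used.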

Referring back to FIG. 21A, process 2100 sets (at 2125) the first character position in the misspelled word as the current character position. The process then edits (at 2130) the misspelled word by replacing/adding/deleting characters at the current character position (e.g., as described above by reference to FIG. 22). The process then determines (at 2135) whether the edited string, starting from the first character of the misspelled word up to the last replaced/added/deleted character, matches any strings in the dictionary prefix tree (e.g., as described above by reference to FIG. 22). If yes, the process proceeds to 2140, which is described below.

Otherwise, the process discards (at 2145) the last change since there is no chance that any further change results in a valid string. The process then determines (at 2150) whether all possible replace/add/delete options for the current character have been examined. If not, the process proceeds to 2125, which was described above. Otherwise, the process determines (at 2155) whether all character positions in the misspelled word have been edited. If not, the process sets (at 2160) the next character position in the misspelled word as the current character position. The process then proceeds to 2130, which was described above. Otherwise, when all character positions are examined, the process proceeds to 2180 to score the suggestions as described below.

When the edited string matches a string in the dictionary prefix tree, the process determines (at 2140) whether the current character position is the last position in the word. If not, the process proceeds to 2160 to replace/add/delete the next character as described above.

Otherwise, the process determines (at 2165) whether the string in the dictionary prefix tree that matches the edited word corresponds to a valid word. In the example of FIG. 22, a final edited word of “cap” matches the string “cap” in the prefix tree. Since the string “cap” in the tree corresponds to a valid word (as conceptually illustrated by the black dot on node 2225), the final edited word is a valid word. On the other hand, the final word can be “capt,” which is reached by editing the misspelled word “cbp” to replace “b” with “a” and adding “t” after the last character. This string matches the string that starts at node 2210 and ends at node 2250. However, node 2250 does not terminate a valid string. The edited word “capt” is therefore not a valid word. Since process 2100 reaches the decision point 2165 after all edits are done to the word, the string “capt” cannot be used as a valid correction suggestion.

Accordingly, when the string in the dictionary prefix tree that matches the edited word does not correspond to a valid word, the process proceeds to 2145 to discard the word as described above. Otherwise, the process adds (at 2170) the word to the list of the candidate suggestions. The process then determines (at 2175) whether all possible replace/add/delete options for all character positions of the misspelled word have been examined. If not, the process proceeds to 2125 to find another correction suggestion, as described above. When all possible changes to the misspelled word are examined, the process scores (at 2180) the suggestions in the list of candidate suggestions.

For instance, a suggestion that is found in the list of common misspellings gets the highest score in some embodiments. Some embodiments also score the candidate suggestions by the number of ways that a misspelled word is transformed into a candidate suggestion by doing different 2 edit-distance operations. The suggestions in some embodiments also receive higher scores when they are reached with fewer modifications to the misspelled word or when the same suggestion is reached through different modifications to the misspelled word. The process then optionally saves (at 2185) a predetermined number of suggestions with highest scores to display to the user upon request. The process then ends.

III. User Interface for Providing Spelling Suggestions and Receiving User Selections

Some embodiments provide a novel graphical user interface metaphor for providing suggestions and receiving a user selection of one of the suggestions. FIG. 23 conceptually illustrates a process 2300 for displaying and applying spelling suggestions in some embodiments of the invention. Process 2300 is described by reference to FIGS. 24 and 25. As shown in FIG. 23, process 2300 receives (at 2305) a request for providing correction suggestions for a misspelled word. In some embodiments, the process receives a request for suggestion when a user places a selection tool over a misspelled word and applies a first selection operation (or an activation operation). An example of such selection operation in some embodiments is when a locating tool (e.g., the cursor) is placed over a misspelled word and a selection tool such as the primary selection button of the mouse is pressed and held (i.e., activated) without releasing the button.

The process then displays (at 2310) a predetermined number of correction suggestions for the misspelled word. In some embodiments, process 2300 retrieves the suggestions that were identified and stored by process 2100 and displays the suggestions. In other embodiments, process 2300 activates process 2100 after the request for correction suggestions is received in order to identify the suggestions.

FIG. 24 conceptually illustrates a portion of a graphical user interface in some embodiments of the invention. The graphical user interface is shown in four stages 2401-2404. In stage 2401, the graphical user interface shows a portion of a document that includes the misspelled string “addres” 2495. String 2495 is visually identified (in this example, underlined) as a misspelled word. As shown, in this stage the location tool (e.g., the cursor) is away from the misspelled string, as conceptually shown by the vertical caret symbol 2410.

As shown in stage 2402, the user places the locating tool over (or in close vicinity of) the misspelled word 2495, and utilizes a selection tool such as the primary button of a mouse to make a first selection operation. In some embodiments, the selection tool is a primary selection tool such as the primary (usually the left) button of a mouse or an equivalent tool on a touchpad or tracking ball. For instance, the first selection operation is pressing down on the primary mouse button and holding the button. This is conceptually shown by the black arrow 2480 to indicate that the primary selection tool on a selection device such as a mouse, touchpad, etc., is activated but not released. As shown in stage 2402, a list 2415 of one or more suggestions for the misspelled word is displayed after the first selection operation is received. As shown, the list 2415 includes three suggestions “address,” “adders,” and “adores.”

One advantage of using the primary selection button over the secondary button is that the embodiments described in this specification function the same regardless of whether the selection tool has only one button or more than one button. In contrast, if displaying the suggestion list requires the use of the secondary selection button (e.g., if it requires a right click on a two-button mouse), then the user either cannot use a selection tool with only one button or has to use menus or other techniques to do spell correction. Also, as shown, the user interface differs from traditional desktop applications, in which a single left click over the misspelled word moves the locating tool's caret 2410 to where the user has clicked while a single right click pops up a context-sensitive menu that contains correction suggestions.

Referring back to FIG. 23, process 2300 then receives (at 2315) an input from the user interface. The process then determines (at 2320) whether the input is a directional change to a location tool such as the cursor. If so, the process changes (at 2330) the position of the selection tool (i.e., selection tool 2480 in FIG. 24) and proceeds back to 2315 to receive another input. When the input is not a directional change, the process determines (at 2335) whether the input is a second operation received from the same selection tool. For instance, when the first operation was received from the primary button of a mouse, the process determines whether the input is received from the same primary button. If not, the process proceeds to 2365, which is described below.

Otherwise, the process determines (at 2340) whether the second operation was received to select one of the displayed suggestions. For instance, the process determines whether the primary selection tool was released over one of the displayed suggestions. If not, the process proceeds to 2350, which is described below. Otherwise, the process replaces (at 2345) the misspelled word with the selected suggestion. The process then removes (at 2360) the displayed suggestions. The process then ends.

As shown in stage 2403 in FIG. 24, the user has moved the selection tool (e.g., by a dragging operation on a mouse or on a touchpad) over one of the suggestions (over the string “address” 2420). The user then makes the selection through the pointing device (e.g., by releasing the primary selection button on the mouse as is conceptually shown by the white arrow 2485). In stage 2404, the misspelled word “addres” 2495 is replaced by the selected suggestion “address” 2425.

When process 2300 determines that the second operation was not to select one of the suggestions, the process determines (at 2350) whether the second operation is to select text that includes all or a portion of the misspelled word (e.g., a portion of the misspelled word, a paragraph or a sentence that includes the misspelled word, etc.). If not, the process proceeds to 2365, which is described below. Otherwise, the process selects (at 2355) text that includes all or a portion of the misspelled word based on the second operation (e.g., as described by reference to FIG. 25, below). In some embodiments, such a selection includes selecting all characters starting from the character over which the first operation (at 2310) was received up to and including the character over which the second operation (at 2335) was received. The process then proceeds to 2360, which was described above.

The process removes (at 2365) the displayed suggestions. The process also discards (at 2370) the previously received first operation. The process then performs (at 2375) the received input. The process then ends.
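The decision made when the primary button is released in process 2300 can be sketched as a small function. The function name, the tuple encoding of the release target, and the return value are hypothetical and serve only to illustrate the branching between replacing the word (2345) and selecting text (2355); in both cases the displayed suggestions would then be removed (2360).

```python
def resolve_release(word, suggestions, press_index, release_target):
    """Resolve a press-hold-release gesture that began over a misspelled word.

    press_index: index of the character over which the button was pressed.
    release_target: hypothetical encoding of where the button was released,
        ("suggestion", i) for the i-th displayed suggestion, or
        ("char", j) for the j-th character of the misspelled word.
    Returns (resulting_word, selected_text_or_None).
    """
    kind, index = release_target
    if kind == "suggestion":
        # Released over a suggestion: replace the misspelled word.
        return suggestions[index], None
    if kind == "char":
        # Released over the text: select from the pressed character up to
        # and including the released character, in either drag direction.
        lo, hi = sorted((press_index, index))
        return word, word[lo:hi + 1]
    # Any other input: the word is left unchanged and nothing is selected.
    return word, None
```

With the word “addres”, pressing over the second “d” (index 2) and releasing over “s” (index 5) selects “dres”, while releasing over the first suggestion replaces the word with “address”.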

FIG. 25 conceptually illustrates a portion of a graphical user interface in some embodiments of the invention. The graphical user interface is described in four stages. The first two stages are the same as stages 2401 and 2402, shown in FIG. 24, and are not shown for clarity. In stage 2405 of FIG. 25, instead of selecting one of the displayed suggestions 2415, the user has started by pressing the primary selection button over the second “d” character, has moved the selection tool across the misspelled word, and has released the primary selection button over the character “s” (as shown in the enlarged portion 2505). In stage 2406, as shown in the enlarged portion 2510, a portion of the string is selected and visually identified by highlight 2515. Also, the suggestions 2415 are removed from the display.

Since the same primary button is used to either select one of the suggestions or perform other operations, a suggestion is only selected when the primary button is released over one of the displayed suggestions. Alternatively, in the embodiment displayed in FIGS. 24 and 25, the user can press and release the primary button twice over the misspelled word in quick succession (commonly referred to as a double click) to select the whole misspelled word instead of getting the suggestions 2415 displayed. Likewise, the user can press and release the primary button over the misspelled word three times in quick succession to select the whole paragraph that includes the misspelled word.

ALTERNATIVE EMBODIMENTS

The embodiments described in FIGS. 23-25 use a single action of the primary selection tool (e.g., press down and hold of the primary mouse button) over a misspelled word to display the list of suggestions and another single action on the same primary button (e.g., release of the primary mouse button) to select one of the displayed options.

In some alternative embodiments, pressing and releasing the primary button over the misspelled word displays the list of suggestions instead of selecting the word. FIG. 26 conceptually illustrates an alternative process 2600 for displaying and applying spelling suggestions in some embodiments of the invention. Process 2600 is described by reference to FIGS. 27-28. As shown in FIG. 26, process 2600 receives (at 2605) a first set of operations from a selection tool to request correction suggestions for a misspelled word. In some embodiments, the process receives a request for suggestions when a user places a selection tool over a misspelled word and applies a first set of selection operations. An example of such a set of selection operations in some embodiments is when a locating tool (e.g., the cursor) is placed over a misspelled word and a selection tool such as the primary selection button of the mouse is pressed and released.

The process then displays (at 2610) a predetermined number of correction suggestions for the misspelled word. In some embodiments, process 2600 retrieves the suggestions that were identified and stored by process 2100 and displays the suggestions. In other embodiments, process 2600 activates process 2100 after the request for correction suggestions is received in order to identify the suggestions.

FIG. 27 conceptually illustrates a portion of a graphical user interface in some embodiments of the invention. The graphical user interface is shown in four stages 2701-2704. In stage 2701, the graphical user interface shows a portion of a document that includes the misspelled string “addres” 2495. String 2495 is visually identified (in this example, underlined) as a misspelled word. As shown, in this stage the location tool (e.g., the cursor) is away from the misspelled string, as conceptually shown by the vertical symbol 2410.

As shown in stage 2702, the user places the locating tool over (or in close vicinity of) the misspelled word 2495, and utilizes a selection tool such as the primary button of a mouse to make a first set of selection operations. In some embodiments, the selection tool is a primary selection tool such as the primary (usually the left) button of a mouse or an equivalent tool on a touchpad or tracking ball. For instance, the first set of selection operations is pressing down on the primary mouse button and releasing the button. This is conceptually shown by the pair of black and white arrows 2780 to indicate that the primary selection tool on a selection device such as a mouse, touchpad, etc., is activated and released. As shown in stage 2702, a list 2415 of one or more suggestions for the misspelled word is displayed after the first set of selection operations is received. As shown, the list 2415 includes three suggestions “address,” “adders,” and “adores.”

Referring back to FIG. 26, process 2600 then receives (at 2615) a set of inputs from the user interface. The process then determines (at 2620) whether the set of inputs is for a directional change to a location tool such as the cursor. If so, the process changes (at 2630) the position of the location tool (i.e., location tool 2410 in FIG. 27) and proceeds back to 2615 to receive another input. When the input is not a directional change, the process determines (at 2635) whether the set of inputs is a second set of operations received from the same selection tool to select one of the displayed suggestions. For instance, when the first operation was received from the primary button of a mouse, the process determines whether the set of inputs is a press down followed by a release received from the same primary button. If not, the process proceeds to 2665, which is described below.

Otherwise, the process replaces (at 2645) the misspelled word with the selected suggestion. The process then removes (at 2660) the displayed suggestions. The process then ends.

As shown in stage 2703 in FIG. 27, the user has moved the selection tool (e.g., by a dragging operation on a mouse or on a touchpad) over one of the suggestions (over the string “address” 2420). The user then makes the selection through the pointing device (e.g., by pressing down and releasing the primary selection button on the mouse as is conceptually shown by the pair of black and white arrows 2785). In stage 2704, the misspelled word “addres” 2495 is replaced by the selected suggestion “address” 2425.

When process 2600 determines that the second operation was not to select one of the suggestions, the process removes (at 2665) the displayed suggestions. The process also discards (at 2670) the previously received first operation. The process then performs (at 2675) the received set of inputs. The process then ends.

In yet other embodiments, the set of operations received (at 2605) is pressing and releasing the primary selection button twice over the misspelled word in quick succession (e.g., a double click on the mouse primary button) to display the suggestions. In these embodiments, the second set of input operations received (at 2635) is another press and release of the primary selection button to select one of the displayed suggestions.

In other alternative embodiments, the list of suggestions is displayed as soon as the location tool (e.g., the cursor) is placed over (commonly referred to as hovered over) the misspelled word. In these embodiments, process 2600 receives (at 2605) an indication that the location tool is placed over the misspelled word. Other operations of process 2600 are the same as described above. FIG. 28 conceptually illustrates a portion of a graphical user interface in some embodiments of the invention. The graphical user interface is shown in four stages 2801-2804. In stage 2801, the graphical user interface shows a portion of a document that includes the misspelled string “addres” 2495. String 2495 is visually identified (in this example, underlined) as a misspelled word. As shown, in this stage the location tool (e.g., the cursor) is away from the misspelled string, as conceptually shown by the vertical symbol 2410.

As shown in stage 2802, the user moves the locating tool from a location 2870 (which is away from the misspelled word 2495) and places the location tool at a location 2880 over the misspelled word 2495 (i.e., hovers the location tool over the misspelled word). As shown in stage 2802, a list 2815 of one or more suggestions for the misspelled word is displayed after the first selection operation is received. As shown, the list 2815 includes three suggestions “address,” “adders,” and “adores.” The list 2815 is another style of displaying the suggestions. The list 2415, shown in FIGS. 24 and 27, displays the suggestions as a vertical list. The list 2815 displays the suggestions as a horizontal list either over the misspelled word (as shown) or under the misspelled word (not shown), depending on the available space. Either vertical or horizontal display of the suggestions can be used in any of the embodiments described in this Specification.

As shown in stage 2803 in FIG. 28, the user has selected the string “address” 2420 through the pointing device (e.g., by pressing down and releasing the primary selection button on the mouse as is conceptually shown by the pair of black and white arrows 2885). In stage 2804, the misspelled word “addres” 2495 is replaced by the selected suggestion “address” 2425.

IV. Software Architecture

In some embodiments, the spell checking processes described above are implemented as software running on a particular machine, such as a computer, a media player, a touchpad, a cell phone, or other handheld or resource-limited devices (or stored in a machine readable medium). FIGS. 29-32 conceptually illustrate the software architecture of different components of a spell checking application of some embodiments.

FIG. 29 conceptually illustrates the software architecture for building the dictionary and common misspellings data structures in some embodiments of the invention. The Data Structure Builder component 2900 includes a Dictionary Data Structure Builder module 2910 for building the dictionary data structure (e.g., the dictionary prefix tree) 2950. In the embodiments in which the data structures are built in the client device, module 2910 extracts dictionary words from the dictionary and common misspelling file 2960, which is received from the server. The Data Structure Builder component 2900 also includes a Common Misspellings Data Structure Builder module 2920 for extracting the common misspellings list from the dictionary and common misspelling file 2960 and building the common misspelling prefix tree (or the look-up map). The Data Structure Builder component 2900 in some embodiments deletes the dictionary and common misspelling file 2960 after the prefix trees are built. In the embodiments in which the server builds the data structures and sends them to the client, the Dictionary Data Structure Builder 2910 and Common Misspellings Data Structure Builder 2920 modules decode the data structures received from the server.
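A minimal sketch of the client-side building step is shown below. The layout of the combined dictionary and common misspelling file is not given in this section, so the line format used here (a plain word per line, or `misspelling>correction1,correction2`) is purely an assumption, as are the function names and the `$` end-of-word marker. The sketch builds a nested-dictionary prefix tree for the dictionary and a look-up map for the common misspellings.

```python
END = "$"  # hypothetical marker key: the path so far spells a valid word

def build_structures(lines):
    """Build a dictionary prefix tree and a common-misspellings look-up map.

    Assumed input format (not taken from the specification): one entry per
    line, either a dictionary word or 'misspelling>fix1,fix2'.
    """
    trie, misspellings = {}, {}
    for line in lines:
        entry = line.strip()
        if not entry:
            continue
        if ">" in entry:
            wrong, fixes = entry.split(">", 1)
            misspellings[wrong] = fixes.split(",")
            continue
        node = trie
        for ch in entry:
            # Words that share a prefix share the same chain of nodes.
            node = node.setdefault(ch, {})
        node[END] = True
    return trie, misspellings

def contains(trie, word):
    """Return True when word is stored in the prefix tree as a full word."""
    node = trie
    for ch in word:
        node = node.get(ch)
        if node is None:
            return False
    return END in node
```

Because a lookup stops at the first missing character, a string such as “addr” is rejected without comparing it against every dictionary entry.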

FIG. 30 conceptually illustrates the software architecture for determining whether a word is misspelled. The Spelling Verifier component 3000 includes a Word Parser module 3005, a Dictionary Prefix Tree Parser module 3010, a Compound Word Identifier module 3015, and a Rule Applier module 3020. The Word Parser module 3005 searches for word delimiters and identifies words in a document. The Word Parser module 3005 receives the words either as a user types them or from user data files 3060.
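As a rough illustration of the delimiter search performed by a word parser, the sketch below splits text on whitespace and trims surrounding punctuation, keeping each word's start offset so that it could later be underlined or replaced. The delimiter set and the function name are assumptions made for this example; the set of delimiters actually used by the embodiments is not stated in this section.

```python
import re

TOKEN_RE = re.compile(r"\S+")      # assumed delimiter: any whitespace
TRIM = ".,;:!?\"'()[]{}"           # assumed punctuation trimmed from word edges

def parse_words(text):
    """Return (start_offset, word) pairs for the words found in text."""
    words = []
    for match in TOKEN_RE.finditer(text):
        token = match.group()
        core = token.strip(TRIM)
        if core:
            # Offset of the word after skipping leading punctuation.
            leading = len(token) - len(token.lstrip(TRIM))
            words.append((match.start() + leading, core))
    return words
```

Internal characters such as the apostrophe in “isn't” or the dots in “10.0.0.1” are preserved, so a later rule-based check can still see the whole token.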

The Dictionary Prefix Tree Parser module 3010 receives the words identified by the Word Parser module 3005 and searches the dictionary prefix tree 2950 to determine whether a word is in the dictionary. The Compound Word Identifier module 3015 identifies compound words. The Rule Applier module 3020 applies a set of rules to determine whether a word is valid as typed by the user. For instance, the rules determine whether the word is an IP address or a hexadecimal number.
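The two example rules mentioned above can be sketched as follows. The rule list, the function names, and the choice to require a `0x` prefix for hexadecimal numbers (so that ordinary words made only of the letters a-f are not exempted) are assumptions for illustration; the IP address check relies on Python's standard `ipaddress` module.

```python
import ipaddress
import re

# Assumption: a hexadecimal number is written with a 0x/0X prefix.
HEX_RE = re.compile(r"0[xX][0-9a-fA-F]+")

def is_ip_address(word):
    """True when word parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(word)
        return True
    except ValueError:
        return False

def is_hex_number(word):
    """True when the whole word is a prefixed hexadecimal number."""
    return HEX_RE.fullmatch(word) is not None

# Hypothetical rule set; further rules would be appended here.
RULES = [is_ip_address, is_hex_number]

def valid_as_typed(word):
    """True when any rule marks the word as intentionally typed."""
    return any(rule(word) for rule in RULES)
```

A word that is missing from the dictionary would be flagged as misspelled only when `valid_as_typed` returns False for it.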

FIG. 31 conceptually illustrates the software architecture for finding suggestions for misspelled words. The Suggestion Finder component 3100 includes a Common Misspelling Prefix Tree Parser module 3105, a Misspelled Word Character Editor module 3110, and a Suggestion Scorer module 3115. The Suggestion Finder component 3100 also utilizes the Dictionary Prefix Tree Parser module 3010 described above.

The Common Misspelling Prefix Tree Parser module 3105 searches the common misspellings prefix tree 2955 to determine whether a misspelled word is a common misspelling and to find the suggestions associated with a commonly misspelled word. The Misspelled Word Character Editor module 3110 performs add/replace/delete on different characters of the misspelled word and uses the Dictionary Prefix Tree Parser 3010 to determine whether the edited misspelled word matches a valid word in the dictionary prefix tree 2950. Once a set of suggestions is found by the Common Misspelling Prefix Tree Parser module 3105 and the Misspelled Word Character Editor module 3110, the Suggestion Scorer module 3115 scores the suggestions and saves a predetermined number of suggestions in the list of suggestions 3150 to display to the user upon request. In the embodiments that utilize a look-up map for searching for common misspellings, the Common Misspelling Prefix Tree Parser 3105 is replaced by a Common Misspelling Look-Up module to search for common misspellings.
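The add/replace/delete editing can be illustrated with a brute-force sketch. It enumerates every string within two single-character edits of the misspelled word and keeps those accepted by a dictionary check. Note that this deliberately omits the prefix-tree pruning that the embodiments use to avoid most of these searches; the function names, the lowercase alphabet, and the `is_word` callback (standing in for the Dictionary Prefix Tree Parser) are assumptions.

```python
import string

def single_edits(word, alphabet=string.ascii_lowercase):
    """All strings that are one add, replace, or delete away from word."""
    edits = set()
    for i in range(len(word) + 1):
        for ch in alphabet:
            edits.add(word[:i] + ch + word[i:])          # add a character
        if i < len(word):
            edits.add(word[:i] + word[i + 1:])           # delete a character
            for ch in alphabet:
                edits.add(word[:i] + ch + word[i + 1:])  # replace a character
    edits.discard(word)
    return edits

def find_suggestions(word, is_word, max_distance=2):
    """Map each valid candidate to the fewest edits needed to reach it."""
    found = {}
    frontier = {word}
    for distance in range(1, max_distance + 1):
        # Apply one more edit to every string produced so far.
        frontier = {e for w in frontier for e in single_edits(w)}
        for candidate in frontier:
            if candidate not in found and is_word(candidate):
                found[candidate] = distance
    found.pop(word, None)  # the misspelled word itself is not a suggestion
    return found
```

For “addres” and a dictionary containing “address”, “adores”, “adders”, and “add”, the sketch reaches “address” and “adores” with one edit and “adders” with two, while “add” is three edits away and is not returned.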

FIG. 32 conceptually illustrates the software architecture for displaying suggestions and receiving user selection for misspelled words. The Spelling Suggestion Provider component 3200 includes a Suggestion Displayer module 3205, a User Input Interpreter module 3210, and a Misspelled Word Replacer module 3215.

The User Input Interpreter module 3210 receives the user's input through the Graphical User Interface module 3220. When the request is for suggestions for a misspelled word, the User Input Interpreter module 3210 passes the request and the identification of the misspelled word to the Suggestion Displayer module 3205.

The Suggestion Displayer module 3205 finds the suggestions for the misspelled word in the list of suggestions 3150 and displays the suggestions through the Graphical User Interface module 3220. When the request is the selection of a suggestion, the User Input Interpreter module 3210 passes the request and the identification of the suggestion and the misspelled word to the Misspelled Word Replacer module 3215. The Misspelled Word Replacer module 3215 replaces the misspelled word with the selected suggestion through the Graphical User Interface module 3220.

V. Electronic System

Many of the above-described features and applications are implemented as software processes that are specified as a set of instructions recorded on a computer readable storage medium (also referred to as computer readable medium). When these instructions are executed by one or more computational or processing unit(s) (e.g., one or more processors, cores of processors, or other processing units), they cause the processing unit(s) to perform the actions indicated in the instructions. Examples of computer readable media include, but are not limited to, CD-ROMs, flash drives, random access memory (RAM) chips, hard drives, erasable programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), etc. The computer readable media do not include carrier waves and electronic signals passing wirelessly or over wired connections.

In this specification, the term “software” is meant to include firmware residing in read-only memory or applications stored in magnetic storage, which can be read into memory for processing by a processor. Also, in some embodiments, multiple software inventions can be implemented as sub-parts of a larger program while remaining distinct software inventions. In some embodiments, multiple software inventions can also be implemented as separate programs. Finally, any combination of separate programs that together implement a software invention described here is within the scope of the invention. In some embodiments, the software programs, when installed to operate on one or more electronic systems, define one or more specific machine implementations that execute and perform the operations of the software programs.

A. Mobile Device

Several applications such as the content authoring and publishing application, the digital content viewing application, and the multimedia management application of some embodiments operate on mobile devices, such as smart phones (e.g., iPhones®), tablets and touchpads (e.g., iPads®), or ebook readers (e.g., Kindle). FIG. 33 is an example of an architecture 3300 of such a mobile computing device. As shown, the mobile computing device 3300 includes one or more processing units 3305, a memory interface 3310 and a peripherals interface 3315.

The peripherals interface 3315 is coupled to various sensors and subsystems, including a camera subsystem 3320, a wireless communication subsystem(s) 3325, an audio subsystem 3330, an I/O subsystem 3335, etc. The peripherals interface 3315 enables communication between the processing units 3305 and various peripherals. For example, an orientation sensor 3345 (e.g., a gyroscope) and an acceleration sensor 3350 (e.g., an accelerometer) are coupled to the peripherals interface 3315 to facilitate orientation and acceleration functions.

The camera subsystem 3320 is coupled to one or more optical sensors 3340 (e.g., a charge-coupled device (CCD) optical sensor, a complementary metal-oxide-semiconductor (CMOS) optical sensor, etc.). The camera subsystem 3320 coupled with the optical sensors 3340 facilitates camera functions, such as image and/or video data capturing. The wireless communication subsystem 3325 serves to facilitate communication functions. In some embodiments, the wireless communication subsystem 3325 includes radio frequency receivers and transmitters, and optical receivers and transmitters (not shown in FIG. 33). These receivers and transmitters of some embodiments are implemented to operate over one or more communication networks such as a GSM network, a Wi-Fi network, a Bluetooth network, etc. The audio subsystem 3330 is coupled to a speaker to output audio (e.g., to output user-specific questions for generating the escrow key). Additionally, the audio subsystem 3330 is coupled to a microphone to facilitate voice-enabled functions, such as voice recognition (e.g., for searching), digital recording, etc.

The I/O subsystem 3335 involves the transfer between input/output peripheral devices, such as a display, a touch screen, etc., and the data bus of the processing units 3305 through the peripherals interface 3315. The I/O subsystem 3335 includes a touch-screen controller 3355 and other input controllers 3360 to facilitate the transfer between input/output peripheral devices and the data bus of the processing units 3305. As shown, the touch-screen controller 3355 is coupled to a touch screen 3365. The touch-screen controller 3355 detects contact and movement on the touch screen 3365 using any of multiple touch sensitivity technologies. The other input controllers 3360 are coupled to other input/control devices, such as one or more buttons. Some embodiments include a near-touch sensitive screen and a corresponding controller that can detect near-touch interactions instead of or in addition to touch interactions.

The memory interface 3310 is coupled to memory 3370. In some embodiments, the memory 3370 includes volatile memory (e.g., high-speed random access memory), non-volatile memory (e.g., flash memory), a combination of volatile and non-volatile memory, and/or any other type of memory. As illustrated in FIG. 33, the memory 3370 stores an operating system (OS) 3372. The OS 3372 includes instructions for handling basic system services and for performing hardware dependent tasks.

The memory 3370 also includes communication instructions 3374 to facilitate communicating with one or more additional devices; graphical user interface instructions 3376 to facilitate graphic user interface processing; image processing instructions 3378 to facilitate image-related processing and functions; input processing instructions 3380 to facilitate input-related (e.g., touch input) processes and functions; audio processing instructions 3382 to facilitate audio-related processes and functions; and camera instructions 3384 to facilitate camera-related processes and functions. The instructions described above are merely exemplary and the memory 3370 includes additional and/or other instructions in some embodiments. For instance, the memory for a smartphone may include phone instructions to facilitate phone-related processes and functions. Additionally, the memory may include instructions for a keychain backup or restoration application as well as other applications. The above-identified instructions need not be implemented as separate software programs or modules. Various functions of the mobile computing device can be implemented in hardware and/or in software, including in one or more signal processing and/or application specific integrated circuits.

While the components illustrated in FIG. 33 are shown as separate components, one of ordinary skill in the art will recognize that two or more components may be integrated into one or more integrated circuits. In addition, two or more components may be coupled together by one or more communication buses or signal lines. Also, while many of the functions have been described as being performed by one component, one of ordinary skill in the art will realize that the functions described with respect to FIG. 33 may be split into two or more integrated circuits.

B. Computer System

FIG. 34 conceptually illustrates an electronic system 3400 with which some embodiments of the invention are implemented. The electronic system 3400 may be a computer (e.g., a desktop computer, personal computer, tablet computer, etc.), phone, PDA, or any other sort of electronic or computing device. Such an electronic system includes various types of computer readable media and interfaces for various other types of computer readable media. Electronic system 3400 includes a bus 3405, processing unit(s) 3410, a graphics processing unit (GPU) 3415, a system memory 3420, a network 3425, a read-only memory 3430, a permanent storage device 3435, input devices 3440, and output devices 3445.

The bus 3405 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 3400. For instance, the bus 3405 communicatively connects the processing unit(s) 3410 with the read-only memory 3430, the GPU 3415, the system memory 3420, and the permanent storage device 3435.

From these various memory units, the processing unit(s) 3410 retrieves instructions to execute and data to process in order to execute the processes of the invention. The processing unit(s) may be a single processor or a multi-core processor in different embodiments. Some instructions are passed to and executed by the GPU 3415. The GPU 3415 can offload various computations or complement the image processing provided by the processing unit(s) 3410.

The read-only-memory (ROM) 3430 stores static data and instructions that are needed by the processing unit(s) 3410 and other modules of the electronic system. The permanent storage device 3435, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the electronic system 3400 is off. Some embodiments of the invention use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 3435.

Other embodiments use a removable storage device (such as a floppy disk, flash memory device, etc., and its corresponding disk drive) as the permanent storage device. Like the permanent storage device 3435, the system memory 3420 is a read-and-write memory device. However, unlike storage device 3435, the system memory 3420 is a volatile read-and-write memory, such as a random access memory. The system memory 3420 stores some of the instructions and data that the processor needs at runtime. In some embodiments, the invention's processes are stored in the system memory 3420, the permanent storage device 3435, and/or the read-only memory 3430. For example, the various memory units include instructions for processing multimedia clips in accordance with some embodiments. From these various memory units, the processing unit(s) 3410 retrieves instructions to execute and data to process in order to execute the processes of some embodiments.

The bus 3405 also connects to the input and output devices 3440 and 3445. The input devices 3440 enable the user to communicate information and select commands to the electronic system. The input devices 3440 include alphanumeric keyboards and pointing devices (also called “cursor control devices”), cameras (e.g., webcams), microphones or similar devices for receiving voice commands, etc. The output devices 3445 display images generated by the electronic system or otherwise output data. The output devices 3445 include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD), as well as speakers or similar audio output devices. Some embodiments include devices such as a touchscreen that function as both input and output devices.

Finally, as shown in FIG. 34, bus 3405 also couples electronic system 3400 to a network 3425 through a network adapter (not shown). In this manner, the computer can be a part of a network of computers (such as a local area network (“LAN”), a wide area network (“WAN”), or an Intranet), or a network of networks, such as the Internet. Any or all components of electronic system 3400 may be used in conjunction with the invention.

Some embodiments include electronic components, such as microprocessors, storage and memory that store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as computer-readable storage media, machine-readable media, or machine-readable storage media). Some examples of such computer-readable media include RAM, ROM, read-only compact discs (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM), a variety of recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.), flash memory (e.g., SD cards, mini-SD cards, micro-SD cards, etc.), magnetic and/or solid state hard drives, read-only and recordable Blu-Ray® discs, ultra density optical discs, any other optical or magnetic media, and floppy disks. The computer-readable media may store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.

While the above discussion primarily refers to microprocessor or multi-core processors that execute software, some embodiments are performed by one or more integrated circuits, such as application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs). In some embodiments, such integrated circuits execute instructions that are stored on the circuit itself. In addition, some embodiments execute software stored in programmable logic devices (PLDs), ROM, or RAM devices.

As used in this specification and any claims of this application, the terms “computer”, “server”, “processor”, and “memory” all refer to electronic or other technological devices. These terms exclude people or groups of people. For the purposes of the specification, the terms display or displaying mean displaying on an electronic device. As used in this specification and any claims of this application, the terms “computer readable medium,” “computer readable media,” and “machine readable medium” are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral signals.

While the invention has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the invention can be embodied in other specific forms without departing from the spirit of the invention. In addition, a number of the figures (including FIGS. 13, 14, 16, 19, 21A, 21B, and 23) conceptually illustrate processes. The specific operations of these processes may not be performed in the exact order shown and described. The specific operations may not be performed in one continuous series of operations, and different specific operations may be performed in different embodiments. Furthermore, the process could be implemented using several sub-processes, or as part of a larger macro process. Thus, one of ordinary skill in the art would understand that the invention is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims.

What is claimed is:
 1. A method of spell checking a document, the method comprising: at a client device, activating a browser-based application received from a server through a network; receiving a data structure for a dictionary comprising a list of correctly spelled words from the server, the data structure utilized to prune the number of searches of the dictionary to match a word to strings associated with valid words in the dictionary; and at the client device, determining a word as having a correct spelling when the word matches a string associated with a valid word in the data structure for the dictionary.
 2. The method of claim 1 further comprising: determining whether a word that does not match any string associated with a valid word in the data structure for the dictionary satisfies a set of rules for valid strings; and identifying the word as a misspelled word when no rule in the set of rules is satisfied.
 3. The method of claim 2 further comprising: receiving, from the server, a data structure for a list of common misspellings comprising a set of strings representing common misspelled words, the common misspellings data structure utilized to prune the number of searches required to match a word to a string in the set of strings; receiving a misspelled word; and when the misspelled word matches a string in the common misspellings data structure, identifying a set of suggestions associated with the matched string to provide as suggestions to correct the misspelled word.
 4. The method of claim 3, the client device having a plurality of browsers, wherein spell checking and providing suggestions for every word provides a same set of results regardless of the browser used to activate the browser-based application.
 5. The method of claim 3, wherein the data structure for the dictionary is a prefix tree, the prefix tree comprising a plurality of nodes in a parent-child hierarchical relationship, each node associated with a character, each node further associated with a string, all descendants of a node having a common prefix of the string associated with that node.
 6. The method of claim 5 further comprising: changing a set of characters in the misspelled word; after changing each character, discarding the change when a resulting string is not found in the dictionary prefix tree; and identifying a set of suggestions for the misspelled word when changing a set of characters in the misspelled word results in a valid string in the dictionary prefix tree.
 7. The method of claim 6, wherein changing the set of characters comprises one of adding a character to the misspelled word, replacing a character in the misspelled word, and deleting a character in the misspelled word.
 8. The method of claim 6 further comprising: scoring the suggestions for the misspelled word based on a set of rules; and displaying up to a predetermined number of high scored suggestions.
 9. The method of claim 8 further comprising assigning a highest score to any suggestions found for the misspelled word in the common misspelling prefix tree.
 10. The method of claim 8 further comprising assigning scores based on how many times a same suggestion is found for the misspelled word when changing the set of characters in the misspelled word results in finding the same valid string in the dictionary prefix tree.
 11. The method of claim 8 further comprising assigning scores based on a number of characters that are changed in the misspelled word before changing a set of characters in the misspelled word results in a valid string in the dictionary prefix tree.
 12. The method of claim 3, wherein the common misspellings data structure is a prefix tree, wherein the prefix tree comprises a plurality of nodes in a parent-child hierarchical relationship, each node associated with a character, each node further associated with a string, all descendants of a node having a common prefix of the string associated with that node, each of a plurality of nodes in the common misspellings prefix tree representing a commonly misspelled word and associated with a set of suggestions to correct the commonly misspelled word.
 13. The method of claim 3, wherein the dictionary data structure and the common misspellings data structure are received from the server as encoded data structures, the method further comprising: at the client device, decoding the dictionary data structure from the received dictionary encoded data structure; and at the client device, decoding the common misspelling data structure from the received common misspelling encoded data structure.
 14. A machine-readable medium storing a program for spell checking a document on a client device, the program executable by at least one processing unit, the program comprising sets of instructions for: activating a browser-based application received from a server through a network; receiving a data structure for a dictionary comprising a list of correctly spelled words from the server, the data structure utilized to prune the number of searches of the dictionary to match a word to strings associated with valid words in the dictionary; and determining a word as having a correct spelling when the word matches a string associated with a valid word in the data structure for the dictionary.
 15. The machine-readable medium of claim 14, the program further comprising sets of instructions for: determining whether a word that does not match any string associated with a valid word in the data structure for the dictionary satisfies a set of rules for valid strings; and identifying the word as a misspelled word when no rule in the set of rules is satisfied.
 16. The machine-readable medium of claim 15, the program further comprising sets of instructions for: receiving, from the server, a data structure for a list of common misspellings comprising a set of strings representing common misspelled words, the data structure for the common misspellings utilized to prune the number of searches required to match a word to a string in the set of strings; receiving a misspelled word; and identifying, when the misspelled word matches a string in the common misspellings data structure, a set of suggestions associated with the matched string to provide as suggestions to correct the misspelled word.
 17. The machine-readable medium of claim 16, the client device having a plurality of browsers, wherein spell checking and providing suggestions for every word provides a same set of results regardless of the browser used to activate the browser-based application.
 18. The machine-readable medium of claim 16, wherein the data structure for the dictionary is a prefix tree, the prefix tree comprising a plurality of nodes in a parent-child hierarchical relationship, each node associated with a character, each node further associated with a string, all descendants of a node having a common prefix of the string associated with that node.
 19. The machine-readable medium of claim 18, the program further comprising sets of instructions for: changing a set of characters in the misspelled word; discarding, after changing each character, the change when a resulting string is not found in the dictionary prefix tree; and identifying a set of suggestions for the misspelled word when changing a set of characters in the misspelled word results in a valid string in the dictionary prefix tree.
 20. The machine-readable medium of claim 19, wherein the set of instructions for changing the set of characters comprises sets of instructions for (i) adding a character to the misspelled word, (ii) replacing a character in the misspelled word, or (iii) deleting a character in the misspelled word.
 21. The machine-readable medium of claim 19, the program further comprising sets of instructions for: scoring the suggestions for the misspelled word based on a set of rules; and displaying up to a predetermined number of high scored suggestions.
 22. The machine-readable medium of claim 21, the program further comprising a set of instructions for assigning a highest score to any suggestions found for the misspelled word in the common misspelling prefix tree.
 23. The machine-readable medium of claim 21 further comprising assigning scores based on how many times a same suggestion is found for the misspelled word when changing the set of characters in the misspelled word results in finding the same valid string in the dictionary prefix tree.
 24. The machine-readable medium of claim 21, the program further comprising a set of instructions for assigning scores based on a number of characters that are changed in the misspelled word before changing a set of characters in the misspelled word results in a valid string in the dictionary prefix tree.
 25. The machine-readable medium of claim 16, wherein the common misspellings data structure is a prefix tree, wherein the prefix tree comprises a plurality of nodes in a parent-child hierarchical relationship, each node associated with a character, each node further associated with a string, all descendants of a node having a common prefix of the string associated with that node, each of a plurality of nodes in the common misspellings prefix tree representing a commonly misspelled word and associated with a set of suggestions to correct the commonly misspelled word.
 26. The machine-readable medium of claim 16, wherein the dictionary data structure and the common misspellings data structure are received from the server as encoded data structures, the program further comprising sets of instructions for: decoding the dictionary data structure from the received dictionary encoded data structure; and decoding the common misspelling data structure from the received common misspelling encoded data structure.
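
Illustrative sketch (not part of the claims). The prefix-tree lookup, pruned single-character changes, and scoring recited in claims 5 through 12 might be sketched in Python as follows. Every name here (TrieNode, insert, walk, single_edit_suggestions, suggest) is hypothetical, and the scoring rule is an assumption chosen for brevity: a suggestion's score is the number of distinct single-character changes that reach it, and suggestions from the common misspellings tree are ranked above all others. Only one character is changed per attempt, so the sketch does not cover scoring by the number of changed characters.

```python
from collections import Counter


class TrieNode:
    """One prefix-tree node; each child edge is labelled with a character."""

    def __init__(self):
        self.children = {}
        self.is_word = False
        self.suggestions = []  # filled only in the common-misspellings tree


def insert(root, word, suggestions=()):
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_word = True
    node.suggestions = list(suggestions)


def walk(node, text):
    """Follow text downward from node; None means the path left the tree."""
    for ch in text:
        node = node.children.get(ch)
        if node is None:
            return None
    return node


def is_word(node, text):
    end = walk(node, text)
    return end is not None and end.is_word


def single_edit_suggestions(root, word):
    """Count how often each valid word is reached by one add/replace/delete."""
    counts = Counter()
    node = root  # node reached by the unchanged prefix word[:i]
    for i in range(len(word) + 1):
        head, rest = word[:i], word[i:]
        if rest and is_word(node, rest[1:]):           # delete word[i]
            counts[head + rest[1:]] += 1
        for ch, child in node.children.items():        # only characters in the tree
            if is_word(child, rest):                   # add ch before position i
                counts[head + ch + rest] += 1
            if rest and ch != rest[0] and is_word(child, rest[1:]):
                counts[head + ch + rest[1:]] += 1      # replace word[i] with ch
        if not rest:
            break
        node = node.children.get(rest[0])
        if node is None:  # prefix is not in the tree: no later change can succeed
            break
    return counts


def suggest(dictionary, misspellings, word, limit=3):
    scores = dict(single_edit_suggestions(dictionary, word))
    top = max(scores.values(), default=0) + 1
    hit = walk(misspellings, word)
    if hit is not None and hit.is_word:
        for s in hit.suggestions:  # assumed rule: known corrections outrank edits
            scores[s] = top
    return sorted(scores, key=lambda s: (-scores[s], s))[:limit]


dictionary, misspellings = TrieNode(), TrieNode()
for w in ["hello", "help", "held", "hell", "world", "the"]:
    insert(dictionary, w)
insert(misspellings, "teh", ["the"])

print(suggest(dictionary, misspellings, "helo"))  # ['hello', 'held', 'hell']
print(suggest(dictionary, misspellings, "teh"))   # ['the']
```

In this sketch the pruning comes from two places: candidate characters are drawn only from the children of the current node, and the scan stops as soon as the unchanged prefix of the word is no longer a path in the tree.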